(54) Image processing apparatus and method

(57) MPEG-4 encoded data is input, and a shape 
code decoder decodes shape data contained in the en- 
coded image data to obtain ROI information contained 
in that image. The frequency transforms of the decoded 
image data are computed to generate transform coeffi- 
cients. 



A bit shift unit bit-shifts transform coefficients, corre- 
sponding to the ROI, of the generated transform coeffi- 
cients, to upper bit planes, stuffs "0"s in blank fields out- 
side the ROI, which are generated by the bit shift proc- 
ess, and stuffs audio data from an audio buffer in blank 
fields within the ROI, which are generated by the bit shift 
process. 
Description

FIELD OF THE INVENTION 

[0001] The present invention relates to an image 
processing apparatus and method for encoding/decod- 
ing data, and its computer program and storage medi- 
um. 

BACKGROUND OF THE INVENTION 

[0002] As a still image encoding scheme, JPEG is cur- 
rently prevalent. JPEG was standardized by ISO (Inter- 
national Organization for Standardization). As a moving 
image encoding scheme, Motion JPEG that exploits 
JPEG as intra-frame coding is known. Furthermore, as 
the Internet proliferates, coding that can assure higher 
functions and higher image quality than JPEG used so 
far is demanded. For this reason, ISO is laying down 
new still image coding standards. This activity is gener- 
ally called "JPEG2000". Refer to Toda, "Special Report 
JPEG2000 Explore Next Generation Image Technique", 
C MAGAZINE October 1999, pp. 6 - 10, for an outline 
of JPEG2000. The ROI (Region of Interest) function described
in this report is new, and is a helpful technique.
[0003] An image encoding apparatus that can imple- 
ment the ROI will be explained below with reference to 
Fig. 13. 

[0004] Referring to Fig. 13, reference numeral 1001 
denotes an image input unit; numeral 1002 denotes a
discrete wavelet transformer; numeral 1003 denotes a 
quantizer; numeral 1004 denotes an entropy encoder; 
numeral 1005 denotes a code output unit; and numeral 
1011 denotes a region designation unit. 
[0005] The image input unit 1001 outputs image data 
that form an image to be encoded in the raster scan or- 
der. The image signal output from the image input unit 
1001 is input to the discrete wavelet transformer 1002. 
The discrete wavelet transformer 1002 executes a two- 
dimensional wavelet transformation process for the in- 
put image signal, and computes and outputs transform 
coefficients. 

[0006] Fig. 14 shows an example of the configuration 
of transform coefficient groups of two levels obtained by 
the two-dimensional discrete wavelet transformation 
process. An image signal is decomposed into coefficient 
sequences HH1, HL1, LH1, LL in different frequency
bands. Note that these coefficient sequences will be
referred to as subbands hereinafter. The coefficients of the
individual subbands are output to the quantizer 1003.
[0007] The region designation unit 1011 determines a 
region (ROI) to be decoded to have higher image quality 
than the surrounding portions in an image to be encod- 
ed, and generates mask information indicating coeffi- 
cients that belong to the ROI upon computing the dis- 
crete wavelet transforms of the image to be encoded. 
[0008] Fig. 15A shows an example of a mask information
generation process.



[0009] When a star-shaped region is designated in an
image by a predetermined instruction input, as shown
in the left image of Fig. 15A, the region designation unit
1011 computes those portions of respective subbands
that include the designated region upon computing the
discrete wavelet transforms of the image including this
designated region. The region indicated by this mask
information corresponds to a range including transform
coefficients of the surrounding region required for
reconstructing an image signal on the boundary of the
designated region.

[0010] The right image of Fig. 15A shows an example
of mask information computed in this way. In this example,
mask information upon discrete wavelet transformation
of the left image in Fig. 15A is computed, as
shown therein. In Fig. 15A, a star-shaped portion
corresponds to the designated region, bits of the mask
information corresponding to this designated region are set
at "1", and other bits of the mask information are set at
"0". Since the entire mask information has the same format
as transform coefficients of two-dimensional discrete
wavelet transformation, whether or not a transform
coefficient at a given position belongs to the designated
region can be identified by checking the corresponding
bit in the mask information. The mask information
generated in this manner is output to the quantizer 1003.
[0011] The quantizer 1003 quantizes the input coefficients
by a predetermined quantization step, and outputs
indices corresponding to the quantized values. The
quantizer 1003 changes quantization indices based on
the mask information input from the region designation
unit 1011 by:

$q' = q \times 2^{8}$ (inside region) (1)

$q' = q$ (outside region) (2)

[0012] With the aforementioned process, only quantization
indices that belong to the region designated
by the region designation unit 1011 are shifted
up (to the MSB side) by 8 bits.
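
The shift-up of equations (1) and (2) can be illustrated with a minimal Python sketch. The function name `shift_up_roi`, the nested-list layout, and the sample values are assumptions made for illustration only; multiplication by 2^8 is used so that the sign of an index is left untouched.

```python
def shift_up_roi(indices, mask, shift=8):
    """Apply q' = q * 2**shift inside the ROI and q' = q outside it.

    indices: 2-D list of quantization indices (hypothetical layout).
    mask:    2-D list of 0/1 flags with the same shape (1 = inside the ROI).
    """
    shifted = []
    for q_row, m_row in zip(indices, mask):
        # Multiply instead of using << so negative indices keep their sign.
        shifted.append([q * (1 << shift) if m else q
                        for q, m in zip(q_row, m_row)])
    return shifted


indices = [[3, -6], [13, 0]]
mask = [[1, 0], [1, 0]]
result = shift_up_roi(indices, mask)
print(result)  # [[768, -6], [3328, 0]]
```

Only the two indices flagged by the mask move to the upper bit planes; the others keep their original magnitude.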

[0013] Figs. 15B and 15C show a change in quantization
indices by this shift-up process. Referring to Fig.
15B, quantization indices are included in subbands, and
change after the shift-up process, as shown in Fig. 15C.
The quantization indices changed in this way are output
to the entropy encoder 1004.

[0014] The entropy encoder 1004 decomposes the input
quantization indices into bit planes, executes binary
arithmetic coding in units of bit planes, and outputs code
streams.

[0015] Fig. 16 is a view for explaining the operation of
the entropy encoder 1004. In this example, a 4 x 4 subband
region includes three nonzero indices, which
respectively have values "+13", "-6", and "+3". The entropy
encoder 1004 scans this region to obtain a maximum
value M, and computes the required number S of bits.
[0016] In Fig. 16, since the maximum coefficient value
M is "13", the number S of bits required for expressing 
this value is "4". Sixteen quantization indices in the se- 
quence are processed in units of four bit planes, as in- 
dicated by the right side in Fig. 16. 
[0017] The entropy encoder 1004 makes binary arith- 
metic coding of bits of the most significant bit plane (in- 
dicated by MSB in Fig. 16) first, and outputs the coding 
result as a bitstream. Then, the encoder 1004 lowers 
the bit plane by one level, and encodes and outputs bits 
of each bit plane to the code output unit 1005 until the 
bit plane of interest reaches the least significant bit 
plane (indicated by LSB in Fig. 16). At this time, a code 
of each quantization index is entropy-encoded immedi- 
ately after the first nonzero bit is detected upon scanning 
the bit plane. 
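
As a rough illustration of paragraphs [0015] to [0017], the following Python sketch computes the number S of bits from the maximum magnitude M and splits the index magnitudes into bit planes from the MSB down to the LSB. The positions of the three nonzero indices inside the 4 x 4 region are assumed, sign handling and the arithmetic coder itself are omitted, and the function names are hypothetical.

```python
def required_bits(indices):
    """Number S of bits needed to express the maximum magnitude M."""
    m = max(abs(q) for q in indices)
    return max(m.bit_length(), 1)


def bit_planes(indices):
    """Return the magnitude bit planes, most significant plane first."""
    s = required_bits(indices)
    return [[(abs(q) >> b) & 1 for q in indices]
            for b in range(s - 1, -1, -1)]


# 16 indices of a 4 x 4 region in scan order; positions are assumed.
region = [0] * 16
region[0], region[5], region[10] = 13, -6, 3

planes = bit_planes(region)
print(required_bits(region))  # 4
print(planes[0][:4])          # MSB plane, first row: [1, 0, 0, 0]
```

An encoder following the text would pass `planes[0]` to the binary arithmetic coder first and `planes[-1]` last.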

[0018] Parallel to laying down of the still image inter- 
national standards, MPEG-4 is being examined as a 
moving image coding scheme, and its international 
standardization is in progress. Conventional moving
image coding represented by MPEG-2 encodes data in
units of frames or fields, but MPEG-4 encodes using video
and audio data as objects to implement re-use and
editing of contents. Furthermore, an object contained in 
video data is also independently encoded, and can be 
processed as an object. Details of MPEG-4 are described
in, e.g., "Outline of MPEG-4 International Standards
Determined", Nikkei Electronics, 1997.9.22 issue,
p. 147-168, international standard ISO 14496-2, and the
like.

[0019] The standardization of MPEG-4 has ad- 
vanced, and an encoding technique of an image having 
an arbitrary shape or the like has been added. Also, a 
copyright protection mechanism of object data is under- 
going standardization to allow re-use of contents. Fur- 
thermore, standardization of a data description for data 
search (MPEG-7) is also underway. This standardiza- 
tion pertains to a description for appending meta infor- 
mation to facilitate a search. 

[0020] When meta information, copyright information, 
or the like is to be appended in JPEG2000, such infor- 
mation must be separately appended in addition to 
JPEG2000 encoded data, resulting in complicated man- 
agement and the like. 

[0021] Upon encoding in units of frames using
JPEG2000, audio data must be separately appended,
resulting in a complicated sync process and data
management.

SUMMARY OF THE INVENTION 

[0022] The present invention has been made in
consideration of the aforementioned prior art, and has as
a concern to provide an image processing apparatus
and method which can append required information
while maintaining compatibility with conventional
JPEG2000, and its computer program and storage medium.

[0023] It is another concern of the present invention
to provide an image processing apparatus and method
which can convert object-encoded image data into
object-encoded data while maintaining independence of
objects, and its computer program and storage medium.
[0024] It is still another concern of the present invention
to provide an image processing apparatus and
method which can easily and reliably generate encoded
data having an object structure in intra-frame coding,
and its computer program and storage medium.

[0025] A preferred embodiment of an image processing
apparatus of the present invention comprises the
following structure.

[0026] An image processing apparatus comprises:
image input means for inputting image data; information
input means for inputting information data; region of
interest setting means for setting a region of interest on
the basis of the image data; transformation means for
generating transform coefficients by computing frequency
transforms of the image data; and control means
for bit-shifting transform coefficients, which correspond
to the region of interest, of the transform coefficients
generated by said transformation means to upper bit
planes, stuffing zeros in blank fields outside the region
of interest, which are generated by the bit shift process,
and stuffing the information data in blank fields within
the region of interest, which are generated by the bit shift
process.

[0027] An image processing method of
the present invention comprises the following steps.
[0028] An image processing method comprises: an
image input step of inputting image data; an information
input step of inputting information data; a region of
interest setting step of setting a region of interest on the
basis of the image data; a transformation step of generating
transform coefficients by computing frequency
transforms of the image data; and a control step of
bit-shifting transform coefficients, which correspond to the
region of interest, of the transform coefficients to upper
bit planes, stuffing zeros in blank fields outside the region
of interest, which are generated by the bit shift process,
and stuffing the information data in blank fields within
the region of interest, which are generated by the bit
shift process.

[0029] According to one aspect of the present invention,
a quantization step of quantizing the transform
coefficients may be further included. In this way, the
information volume can be effectively reduced.
[0030] According to one aspect of the present invention,
the frequency transformation step executes discrete
wavelet transformation. In this way, shape information
can be reflected in the frequency domain.
[0031] According to one aspect of the present invention,
information data to be appended is audio data.
[0032] According to one aspect of the present inven- 
tion, information data to be appended is meta data that 
pertains to an image description. 
[0033] According to one aspect of the present invention,
information data to be appended is intellectual
property right information.

[0034] According to one aspect of the present inven- 
tion, the method comprises the encoding step of decom- 
posing the output of the stuffing step into bit planes, and 
encoding the bit planes. In this way, the information vol- 
ume can be reduced. 

[0035] Another preferred embodiment of an image
processing apparatus of the present invention comprises
the following structure.

[0036] An image processing apparatus comprises: 
shape information extraction means for extracting 
shape information of an object from image data; object
texture information extraction means for extracting texture
information of the object from the image data;
background texture information extraction means
for extracting texture information of a background from 
the image data; first frequency transformation means for 
computing frequency transforms of the texture informa- 
tion of the object and the texture information of the back- 
ground on the basis of the shape information extracted 
by said shape information extraction means; second fre- 
quency transformation means for computing frequency 
transforms of the texture information of the background; 
stuffing means for stuffing zeros in a region outside a 
region of the object on the basis of an output from said 
first frequency transformation means, and the shape in- 
formation; and bit plane encoding means for decompos- 
ing an output from said stuffing means into bit planes 
and encoding the bit planes, and decomposing an out- 
put from said second frequency transformation means 
into bit planes and encoding the bit planes. 
[0037] According to one aspect of the present inven- 
tion, the first and second frequency transformation 
means execute discrete wavelet transformation. In this 
way, shape information can be reflected in the frequency 
domain. 

[0038] According to one aspect of the present inven- 
tion, the apparatus comprises shape information 
change means for changing shape information to ex- 
pand on the basis of that shape information and a fre- 
quency transformation scheme. In this way, a natural im- 
age can be reproduced around the edge of an object 
without any special process. 

[0039] Other features and advantages of the present 
invention will be apparent from the following descrip- 
tions taken in conjunction with the accompanying draw- 
ings, in which like reference characters designate the 
same or similar parts throughout the figures thereof. 

BRIEF DESCRIPTION OF THE DRAWINGS 

[0040] The accompanying drawings, which are incorporated
in and constitute a part of the specification,
illustrate embodiments of the invention and, together with
the descriptions, serve to explain the principle of the
invention.

Fig. 1 is a block diagram showing the arrangement
of an image processing apparatus according to the
first embodiment of the present invention;
Fig. 2 is a view for explaining a bit plane composition
process in an embodiment of the present invention;
Fig. 3 is a view for explaining encoded data in an
embodiment of the present invention;
Fig. 4 is a block diagram showing the arrangement
of an image processing apparatus according to the
second embodiment of the present invention;
Fig. 5 is a block diagram showing the arrangement
of an image processing apparatus according to the
third embodiment of the present invention;
Fig. 6 is a block diagram showing the arrangement
of an image processing apparatus according to the
fourth embodiment of the present invention;
Fig. 7 is a view showing a bit plane composition
process in an embodiment of the present invention;
Fig. 8 is a view for explaining encoded data in an
embodiment of the present invention;
Fig. 9 is a block diagram showing the arrangement
of an image processing apparatus according to the
fifth embodiment of the present invention;
Fig. 10 is a block diagram showing the arrangement
of an image processing apparatus according to the
sixth embodiment of the present invention;
Fig. 11 is a flow chart for explaining an image
encoding process according to the sixth embodiment
of the present invention;
Fig. 12 is a flow chart for explaining an image
encoding process according to the seventh embodiment
of the present invention;
Fig. 13 is a block diagram showing an outline of
JPEG2000;
Fig. 14 is a view for explaining the subband
configuration of discrete wavelet transformation;
Figs. 15A to 15C are views for explaining an outline
of an ROI process of JPEG2000;
Fig. 16 is a view for explaining an outline of bit plane
coding based on JPEG2000;
Figs. 17A to 17C are views for explaining an outline
of an image to be encoded;
Fig. 18 is a view for explaining an outline of decoding
of the ROI process of JPEG2000;
Fig. 19 is a view for explaining an outline of a
composition process associated with an ROI in
JPEG2000;
Fig. 20 is a block diagram showing the arrangement
of an image processing apparatus according to the
eighth embodiment of the present invention;
Fig. 21 is a block diagram showing the arrangement
of an image processing apparatus according to the
ninth embodiment of the present invention;
Fig. 22 is a flow chart for explaining an image
decoding process according to the 10th embodiment
of the present invention;
Fig. 23 is a flow chart briefly showing the flow of
process until encoding;
Fig. 24 is a block diagram showing the arrangement
of an image processing apparatus according to the
11th embodiment of the present invention;
Figs. 25A to 25C are views for explaining bit plane
states in an embodiment of the present invention;
Fig. 26 is a flow chart for explaining an image
encoding process according to the 11th embodiment
of the present invention;
Fig. 27 is a view for explaining encoded data in an
embodiment of the present invention;
Fig. 28 is a block diagram showing the arrangement
of an image processing apparatus according to the
12th embodiment of the present invention;
Fig. 29 is a flow chart showing a decoding process
according to the 12th embodiment of the present
invention;
Fig. 30 is a block diagram showing the arrangement
of an image processing apparatus according to the
13th embodiment of the present invention;
Fig. 31 is a block diagram showing the arrangement
of an image processing apparatus according to the
14th embodiment of the present invention;
Fig. 32 is a flow chart for explaining an image
encoding process according to the 15th embodiment
of the present invention;
Figs. 33 and 34 are flow charts showing the process
in step S606 in Fig. 32;
Fig. 35 is a flow chart for explaining an image
encoding process according to the 17th embodiment
of the present invention; and
Figs. 36 and 37 are flow charts showing a decoding
process in step S702 in Fig. 35.

DESCRIPTION OF THE PREFERRED 
EMBODIMENTS 

[0041] Preferred embodiments of the present inven- 
tion will be described in detail hereinafter with reference 
to the accompanying drawings. 

[First Embodiment] 

[0042] Fig. 1 is a block diagram showing the arrangement
of an image processing apparatus according to the
first embodiment of the present invention. Note that this
embodiment will explain a case wherein MPEG-4 encoded
data is input and encoded, and encoded data
similar to JPEG2000 encoded data is output.
[0043] Referring to Fig. 1, reference numeral 1 de- 
notes an MPEG-4 encoded data input unit for inputting 
MPEG-4 encoded data. Reference numeral 2 denotes
a demultiplexer for demultiplexing input MPEG-4 encoded
data, and inputting demultiplexed data to respective
units. Reference numeral 3 denotes a shape code decoder
for receiving and decoding shape encoded data
of an object, which is encoded by MPEG-4 and is
demultiplexed by the demultiplexer 2. Reference numeral
4 denotes a texture decoder for decoding the texture of 
an object demultiplexed by the demultiplexer 2. Refer- 
ence numeral 5 denotes a texture decoder for decoding 
the texture of encoded data of a background image de- 
multiplexed by the demultiplexer 2. Reference numeral 
6 denotes a shape information correction unit for cor- 
recting shape information decoded by the shape code 
decoder 3. Reference numeral 7 denotes a mask en- 
coder for encoding mask information indicating the 
shape and position of an ROI. Reference numerals 8 
and 9 denote discrete wavelet transformers for respec- 
tively computing the discrete wavelet transforms of input 
image data. Reference numerals 10 and 11 denote 
quantizers for receiving and quantizing transform coef- 
ficients computed by the discrete wavelet transformers 
8 and 9. Reference numeral 12 denotes a bit shift con- 
troller for controlling by determining the number of bits 
which form a bit plane and a bit plane composition meth- 
od on the basis of the quantization results of the quan- 
tizers 10 and 11. Reference numeral 13 denotes a bit 
plane composition unit for compositing bit planes in ac- 
cordance with an instruction from the bit shift controller 
12. Reference numeral 14 denotes an entropy encoder 
for encoding in units of bit planes. Reference numeral
15 denotes a multiplexer for shaping outputs from the
mask encoder 7, bit shift controller 12, and entropy
encoder 14 into encoded data according to the format of
JPEG2000. Reference numeral 16 denotes a code output
unit for outputting generated encoded data.
[0044] The operation of the aforementioned arrange- 
ment will be explained below. 

[0045] The MPEG-4 encoded data input unit 1 inputs 
MPEG-4 encoded data consisting of one object and 
background image in a core profile. The input encoded 
data is input to the demultiplexer 2, and is demultiplexed 
into encoded data that pertains to a shape code of the 
object, encoded data that pertains to texture, and en- 
coded data that pertains to background texture. The en- 
coded data that pertains to the shape code of the object 
is input to the shape code decoder 3, the encoded data 
that pertains to texture of the object to the texture de- 
coder 4, and the encoded data that pertains to back- 
ground texture to the texture decoder 5. 
[0046] The shape code decoder 3 decodes binary in- 
formation that represents the object shape. In this em- 
bodiment, shape data shown in, e.g., Fig. 17B will be 
exemplified as such shape information. 
[0047] This shape information is decoded and input
to the shape information correction unit 6. The shape
information correction unit 6 enlarges the region outward
from this shape in consideration of the number of taps
of discrete wavelet transformation. That is, the unit 6
corrects the shape information so that it includes the
range over which pixel values in the object are affected by
discrete wavelet transformation. Such information can be
uniquely determined by the number of taps and the
number of subbands of wavelet transformation. Since
the corrected shape information serves as mask information
of an ROI, it is input to the mask encoder 7, and
is encoded according to the format of JPEG2000.
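
The correction performed by the shape information correction unit 6 can be pictured as a dilation of the binary shape. The Python sketch below is an illustration only: the parameter `margin` stands in for the expansion width, which the text says is determined by the number of taps and the number of subbands but for which no formula is given here, and the square neighbourhood and function name are likewise assumptions.

```python
def expand_mask(mask, margin):
    """Expand a binary shape mask outward by `margin` pixels.

    `margin` is a hypothetical stand-in for the range affected by the
    wavelet filter; the text does not give its exact derivation.
    """
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                # Mark the square neighbourhood around each object pixel.
                for yy in range(max(0, y - margin), min(h, y + margin + 1)):
                    for xx in range(max(0, x - margin), min(w, x + margin + 1)):
                        out[yy][xx] = 1
    return out


shape = [[0] * 5 for _ in range(5)]
shape[2][2] = 1                      # a one-pixel "object"
corrected = expand_mask(shape, margin=1)
print(sum(map(sum, corrected)))      # 9
```

The corrected mask always contains the original shape, so pixels inside the object remain flagged as ROI.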
[0048] The texture decoder 4 decodes the texture of
the object. The texture decoder 5 decodes the texture
of the background. The discrete wavelet transformer 8
receives and transforms the outputs from the texture
decoders 4 and 5 in accordance with the shape information.
That is, it receives the output from the texture decoder 4 for
pixels which are determined, based on the shape information
decoded by the shape code decoder 3, to
fall within the object, receives the output from the
texture decoder 5 for the region corrected and expanded
by the shape information correction unit 6, and computes
their discrete wavelet transforms. The discrete
wavelet transformer 9 receives the background texture
as the output from the texture decoder 5, and computes
the discrete wavelet transforms.

[0049] The quantizer 10 receives the output from the
discrete wavelet transformer 8 and quantizes the output
by predetermined quantization coefficients. Likewise,
the quantizer 11 quantizes the output from the discrete
wavelet transformer 9 by predetermined quantization
coefficients. The quantization results of these quantizers
10 and 11 are input to the bit shift controller 12 and
bit plane composition unit 13.

[0050] The bit shift controller 12 computes the number
Bb of bits required for expressing quantization values of
transform coefficients at positions of the background
texture occluded by the object, and the number Bo of
bits required for expressing quantization values of the
texture of the object, and determines the number of bit
planes and composition method for bit plane composition.
The controller 12 generates a signal for controlling
the bit plane composition unit 13 in accordance with the
determination results. For example, when the maximum
value of the quantization result of the background texture
is equal to or smaller than "63" based on the output
from the quantizer 10, and the maximum value of the
quantization result of the background texture at the
object position is equal to or smaller than "31", the number
Bb of bits is "5". Also, when the maximum value of the
quantization result of the texture of the object is equal
to or smaller than "63" based on the output from the
quantizer 11, the number Bo of bits is "6". Therefore, the
number Bt of bit planes used in bit plane encoding is the
sum of the numbers Bb and Bo of bits, i.e., 11 bits.
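
The arithmetic of this example can be checked with a short Python sketch. The helper name `bits_for` is hypothetical; the maximum values 31 and 63 are the ones quoted in the paragraph above.

```python
def bits_for(max_value):
    """Number of bits needed to express magnitudes up to max_value."""
    return max(int(max_value).bit_length(), 1)


bb = bits_for(31)   # background texture occluded by the object
bo = bits_for(63)   # texture of the object
bt = bb + bo        # bit planes used in bit plane encoding

print(bb, bo, bt)   # 5 6 11
```

Because Bb is measured only over the occluded positions, it can be smaller than the width needed for the whole background, which keeps Bt low.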
[0051] In this way, the bit shift controller 12 controls 
to output the quantization result of the background tex- 
ture in the lower 6 bits for a region that does not overlap 
the object, and stuffs "0"s in the upper 5 bits on the basis 
of the shape information. As for an overlapping region, 
the controller 12 controls the bit plane composition unit 
13 to output the quantization result of the object in the 
upper 6 bits, and to composite the quantization result of
the background texture in the lower 5 bits. Also, the
controller 12 encodes the number Bt of bit planes, and the
number Bo of bits of the object, and inputs them as a
BITS code to the multiplexer 15.
[0052] The bit plane composition unit 13 composites
bit planes under the control of the bit shift controller 12.
Fig. 2 shows this process.

[0053] Referring to Fig. 2, the least significant bit of
the object is placed at the sixth bit from the bottom for a
portion 200 where the object is present. The composition
result is input to the entropy encoder 14. In Fig. 2, bits
representing the background texture corresponding to
a region outside the region of the object 200 are present
in a region 201. Reference numeral 202 denotes blank
fields where "0" bits are stuffed; and numeral 203 denotes
an empty region after the object has undergone
the bit shift process.

[0054] A process until the bit data shown in Fig. 2 is
generated will be briefly explained below. In order to
encode both the object and its background (including
background regions inside and outside the object), the object
and background texture corresponding to the region
outside the object region undergo frequency transformation
to generate first transform coefficients (the outputs
from the discrete wavelet transformer 8 and quantizer
10), and the background texture corresponding to
a region inside the object image region undergoes frequency
transformation to generate second transform
coefficients (the outputs from the discrete wavelet
transformer 9 and quantizer 11). Of the first transform
coefficients, bits corresponding to the object region are
bit-shifted to an upper bit plane, bits "0" are stuffed in blank
fields (202 in Fig. 2) formed after the bit shift process,
and the second transform coefficients corresponding to
the region inside the object region are stuffed in blank
fields (203 in Fig. 2) within the object region formed by
the bit shift process.
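
The composition just described can be sketched in Python as follows. This is a simplified illustration, not the apparatus itself: the function name `compose_bit_data`, the nested-list layout, the sample values, and the restriction to non-negative magnitudes are all assumptions. `first` plays the role of the first transform coefficients, `second` the second transform coefficients, and `bb` the number Bb of background bits.

```python
def compose_bit_data(first, second, mask, bb):
    """Composite object and occluded-background values into one word.

    Inside the object (mask == 1): the object value is shifted up by
    bb bits and the background value fills the vacated lower bits (203).
    Outside the object: the value stays in the lower bits and the upper
    bits are implicitly zero (202).
    """
    out = []
    for f_row, s_row, m_row in zip(first, second, mask):
        row = []
        for f, s, m in zip(f_row, s_row, m_row):
            row.append((f << bb) | s if m else f)
        out.append(row)
    return out


first = [[63, 40], [10, 7]]    # object / outside-background magnitudes
second = [[31, 0], [0, 5]]     # background magnitudes behind the object
mask = [[1, 0], [0, 1]]        # 1 = inside the object region
composed = compose_bit_data(first, second, mask, bb=5)
print(composed)                # [[2047, 40], [10, 229]]
```

With Bb = 5 and Bo = 6 every composed value fits in the 11 bit planes of the example, and a decoder can separate the two parts again with a shift and a mask on the lower Bb bits.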

[0055] The entropy encoder 14 encodes bit planes in
turn from the MSB side, and supplies the encoded results
to the multiplexer 15. The multiplexer 15 shapes
the input data to encoded data according to the
JPEG2000 format.

[0056] The flow of the processes until encoding will
be briefly explained below using Fig. 23. In step S301,
MPEG-4 encoded data is decoded to obtain the object
and its background (including background regions inside
and outside the object). In step S302, the object
and the background corresponding to a region outside
the object region undergo frequency transformation to
generate first transform coefficients. In step S303, the
background texture corresponding to the region inside
the object image undergoes frequency transformation
to generate second transform coefficients. Note that the
processing order of steps S302 and S303 is not particularly
limited as long as both of them are done (the two
transformation processes may be done sequentially by
a single transformer, or in parallel by two
transformers). In step S304,
bits corresponding to the object region of the first transform
coefficients are bit-shifted to an upper bit plane,
bits "0" are stuffed in blank fields (202 in Fig. 2) formed
after the bit shift process, and the second transform
coefficients corresponding to the region inside the object
region are stuffed in blank fields (203 in Fig. 2) within
the object region formed by the bit shift process. Finally,
in step S305, the obtained bit data shown in Fig. 2 is
entropy-encoded in turn from upper bit planes.
[0057] Fig. 3 shows an output example of encoded 
data obtained by the aforementioned encoding process. 
[0058] In Fig. 3, a header including a code which in- 
dicates information of the size of the encoded image or 
the like is followed by a BITS code as the encoding result 
of the number Bt of bit planes and the number Bo of bits 
of the object. Then, the encoding result of mask infor- 
mation output from the mask encoder 7 follows. Further- 
more, a SHIFT code indicating the presence of the back- 
ground texture in the lower bits of the object follows. Fi- 
nally, the entropy encoding result (data) appears. The 
entropy encoding result is separated into subbands (LL 
to HH1), each of which consists of encoded data for 11 
bit planes. The multiplexed encoded data is externally 
output via the code output unit 16. 
[0059] With a series of operations, encoded data, 
which preserves background image data lost by stuffing 
"0"s in the conventional process, can be generated. 
Since bit plane composition is done by detecting the 
number of bits required for a portion that overlaps the 
object, the coding efficiency can be improved by reduc- 
ing the number of bit planes. 

[0060] In this embodiment, MPEG-4 encoded data is 
input, and JPEG2000 encoded data is output. However, 
the present invention is not limited to such specific data. 
[0061] In this embodiment, quantizers are provided to 
improve coding efficiency. However, the quantizers may 
be omitted to obtain reversible codes free from any de- 
terioration. 

[Second Embodiment] 

[0062] Fig. 4 is a block diagram showing the arrange- 
ment of an image processing apparatus according to the 
second embodiment of the present invention. Note that 
the same reference numerals denote the same building 
components as those in the first embodiment, and a de- 
tailed description thereof will be omitted. The second 
embodiment will exemplify a case wherein image data 
sensed by cameras 31 and 32 are input, and are encod- 
ed and output. 

[0063] Referring to Fig. 4, reference numerals 31 and 32
denote cameras for sensing an image and generating video
signals. Reference numeral 33 denotes an object extraction
unit for extracting an object from the captured video signal
in accordance with a known algorithm, for example,
chroma-key. Reference numeral 34 denotes a frame memory
for holding image data captured by the camera 32.



[0064] Image data captured by the camera 31 is input 
to the object extraction unit 33 in units of frames. The 
object extraction unit 33 cuts out an object, extracts its 
shape as binary mask information, and outputs the cut-out
image data as texture data of the object.

[0065] On the other hand, the camera 32 captures 
background image data, and stores the image data in 
the frame memory 34 so as to execute a process in syn- 
chronism with the object extraction unit 33. 

[0066] In the second embodiment, subsequent processes
are the same as those in the first embodiment. That is, the
shape information correction unit 6 receives the mask
information from the object extraction unit 33, and corrects
the mask information by expanding its edge. The correction
result is encoded by the mask encoder 7, and is input to the
multiplexer 15. The discrete wavelet transformer 8 stuffs
"0"s in a region outside the object, and reads out the
corresponding image data from the frame memory 34 for the
expanded portion, in accordance with the shape information
corrected by the shape information correction unit 6.
Furthermore, the discrete wavelet transformer 8 selects the
output from the object extraction unit 33 for a region inside
the object, and computes the discrete wavelet transforms.

[0067] At the same time, the discrete wavelet transformer 9
computes the discrete wavelet transforms of the background
image. The quantizers 10 and 11 receive and quantize the
wavelet transform coefficients output from these discrete
wavelet transformers 8 and 9. The bit shift controller 12
determines the bit distribution between the object and
background upon composition on the basis of the mask
information from the shape information correction unit 6 and
the quantization results of the quantizers 10 and 11, and
controls the bit plane composition unit 13. At the same time,
the controller 12 encodes required information. The bit plane
composition unit 13 generates 11-bit bit planes as in the
first embodiment. The entropy encoder 14 encodes these bit
planes and outputs the encoded data to the multiplexer 15.
The multiplexer 15 shapes the encoded data in accordance
with the JPEG2000 format, and externally outputs encoded
data via the code output unit 16.

[0068] As described above, according to the second
embodiment, encoded data which can independently
process an object can be generated on the basis of the
captured image data.

[0069] In the second embodiment, quantizers are provided
to improve coding efficiency. However, the quantizers may
be omitted to obtain reversible codes free from any
deterioration.

[Third Embodiment]

[0070] Fig. 5 is a block diagram showing the arrangement
of an image processing apparatus according to the third
embodiment of the present invention. The third embodiment
will explain a case wherein JPEG2000 encoded data
generated in the first embodiment is input, and
MPEG-4 encoded data is output.
[0071] Referring to Fig. 5, reference numeral 51 denotes
a code input unit for receiving JPEG2000 encoded data
generated according to the first embodiment. Reference
numeral 52 denotes a demultiplexer for demultiplexing the
input encoded data, and inputting demultiplexed data to
respective units. Reference numeral 53 denotes a flag
discrimination unit for decoding and discriminating a SHIFT
code of encoded data. Reference numeral 54 denotes a mask
decoder for decoding mask information that represents the
shape and position of an ROI, and a BITS code which
indicates the number of bits of the whole image and the
number of bits of the ROI portion. Reference numeral 55
denotes a shape information correction unit for correcting
shape information. Reference numeral 56 denotes a shape
information encoder for encoding shape information by
MPEG-4. Reference numeral 57 denotes an entropy decoder
for decoding in units of bit planes. Reference numeral 58
denotes a bit plane decomposition unit for decomposing
encoded data into bit plane data of an object portion and
those of a background portion, and outputting them to
dequantizers 59 and 60, respectively. The dequantizers 59
and 60 execute dequantization corresponding to the
aforementioned quantizers 10 and 11. Reference numerals 61
and 62 denote inverse discrete wavelet transformers which
execute the inverse of the discrete wavelet transformation
of the aforementioned discrete wavelet transformers 8 and 9.
Reference numeral 63 denotes an object shaping unit for
shaping image data of an object in accordance with shape
information corrected by the shape information correction
unit 55. Reference numerals 64 and 65 denote texture
encoders for respectively texture-encoding the object and
background portions by MPEG-4. Reference numeral 66
denotes a multiplexer for forming encoded data based on the
outputs from the shape information encoder 56 and texture
encoders 64 and 65 in accordance with the MPEG-4 format.
Reference numeral 67 denotes an MPEG-4 encoded data
output unit for outputting the generated MPEG-4 encoded
data.

[0072] In such arrangement, the code input unit 51 re- 
ceives encoded data generated by the first embodiment 
mentioned above. The input encoded data is input to the 
demultiplexer 52 to decode a header, thus acquiring re- 
quired information and inputting such information to re- 
spective units. Furthermore, encoded data of a BITS 
code and mask information are input to the mask de- 
coder 54, a SHIFT code to the flag discrimination unit 
53, and the remaining data to the entropy decoder 57. 
[0073] The flag discrimination unit 53 decodes the 
SHIFT code to discriminate if information of the back- 
ground image is present in lower bits of the ROI portion. 
If it is determined that no background image information 
is present, a normal ROI process in JPEG2000 coding 
is done. On the other hand, if it is determined that the 
background image is present, that background image
data is reconstructed.

[0074] A case will be explained first wherein the back- 
ground image is present. 

[0075] The mask decoder 54 decodes the mask information
indicating the ROI shape and position, and the BITS code
which indicates the number of bits of the whole image and
the number of bits of the ROI portion. Note that the ROI
portion represents an object. Since the region of the mask
information has been expanded to outside the object shape
by the shape information correction unit 6 in the first
embodiment described above in consideration of the number
of taps of discrete wavelet transformation, the shape
information correction unit 55 executes an inverse process.
More specifically, the shape information correction unit 55
corrects shape information to that which does not include
the range influenced by pixel values within the object by
discrete wavelet transformation. The corrected shape
information is input to the shape information encoder 56,
and is encoded according to MPEG-4 shape information coding.
[0076] On the other hand, the entropy decoder 57 decodes
bit planes in turn from the MSB side, and inputs the
decoding results to the bit plane decomposition unit 58.
The bit plane decomposition unit 58 receives data of the
bit planes shown in Fig. 2. In Fig. 2, texture data of the
object 200 is decomposed in accordance with the shape
information decoded by the mask decoder 54, and the
number Bo of bits of the object, and is input to the
dequantizer 59. Also, "0"s are stuffed in a portion of
texture data of the background image 201 in Fig. 2, where
the least significant bits of the object are composed, and
that texture data is input to the dequantizer 60.
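
The decomposition step can be sketched as follows, under one simplified reading of the paragraph above. The sketch assumes non-negative integer values, that the object occupies the upper Bo planes inside the mask, and that the background occupies the remaining Bt - Bo lower planes there. The function and parameter names are hypothetical.

```python
def decompose_bit_planes(composed, mask, total_bits, object_bits):
    """Split composed values into object and background magnitudes.

    total_bits corresponds to Bt and object_bits to Bo in the text. Inside
    the object region the background is assumed to occupy the remaining
    total_bits - object_bits lower planes.
    """
    shift = total_bits - object_bits
    low_mask = (1 << shift) - 1
    obj, bg = [], []
    for value, inside in zip(composed, mask):
        if inside:
            obj.append(value >> shift)    # upper planes: object texture
            bg.append(value & low_mask)   # lower planes: background, upper bits read as "0"
        else:
            obj.append(0)                 # no object data outside the mask
            bg.append(value)              # the value is background only
    return obj, bg
```

The first list would go to the dequantizer 59 and the second to the dequantizer 60.
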

[0077] The dequantizers 59 and 60 respectively execute
dequantization corresponding to the quantizers 10 and 11,
and their dequantization results are respectively input to
the inverse discrete wavelet transformers 61 and 62. The
inverse discrete wavelet transformers 61 and 62 compute the
inverse discrete wavelet transforms of the inputs, thus
reconstructing texture data.
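
The text only states that quantization uses predetermined quantization coefficients, so the following is a sketch under an assumption: a uniform scalar quantizer with a single step size, truncating toward zero. The names quantize, dequantize and step are hypothetical.

```python
def quantize(coeff, step):
    """Uniform scalar quantizer (assumed): truncate |coeff| / step toward zero."""
    sign = -1 if coeff < 0 else 1
    return sign * int(abs(coeff) // step)


def dequantize(index, step):
    """Inverse mapping: reconstruct at the lower edge of the quantization interval."""
    return index * step
```

With such a quantizer the reconstruction error is below one step, which is why omitting the quantizers yields reversible codes.
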

[0078] The output from the inverse discrete wavelet
transformer 61 is input to the object shaping unit 63,
which receives original shape information of the object
as the output from the shape information correction unit 55,
and replaces the background portion, which is determined
to be a region outside the object on the basis of that shape
information, by "0"s. The texture encoder 64 encodes
texture data of the object shaped by the object shaping
unit 63 by MPEG-4 texture coding. The texture encoder 65
also encodes texture data of the background by MPEG-4
texture coding.
[0079] The multiplexer 66 shapes input data to encoded
data according to the MPEG-4 core profile format.
The shaped encoded data is externally output via the
MPEG-4 encoded data output unit 67 as MPEG-4 encoded
data containing one object and background image in a
core profile.

[0080] A case will be explained below wherein the flag
discrimination unit 53 determines that no background
image is present.

[0081] In this case, the flag discrimination unit 53
disables the shape information correction unit 55, shape
information encoder 56, dequantizer 59, inverse discrete
wavelet transformer 61, object shaping unit 63, and texture
encoder 64. Also, the bit plane decomposition unit 58 is
controlled to execute a normal ROI process of JPEG2000.

[0082] The mask decoder 54 decodes the mask infor- 
mation indicating the ROI shape and position, and the 
BITS code which indicates the number of bits of the whole
image and the number of bits of the ROI portion. The 
entropy decoder 57 decodes bit planes in turn from the 
MSB side, and supplies the decoding results to the bit 
plane decomposition unit 58. The bit plane decomposi- 
tion unit 58 receives data of the bit planes like those 
shown in Fig. 19. 

[0083] Referring to Fig. 19, texture data of the object 
200 is demultiplexed in accordance with the shape in- 
formation decoded by the mask decoder 54 and the 
number Bo of bits of the object, is shifted to lower bit 
planes, and is then input to the dequantizer 60. At this
time, the bit plane data have the bit plane configuration
shown in Fig. 18.

[0084] The dequantizer 60 dequantizes the input da- 
ta, and the inverse discrete wavelet transformer 62 com- 
putes the inverse discrete wavelet transforms, thus re- 
constructing the texture data of the object. The texture 
encoder 65 encodes the texture data of the object in ac- 
cordance with MPEG-4 texture coding in the same man- 
ner as the background texture data. 
[0085] The multiplexer 66 shapes input data to encod- 
ed data according to an MPEG-4 simple profile format. 
That is, the encoded data in which the object is shaped 
as encoded data of a rectangular image is output from 
the MPEG-4 encoded data output unit 67 as MPEG-4 
encoded data containing one object. 
[0086] With a series of operations mentioned above, 
encoded data which holds both object and background 
image data can be converted into object encoded data 
while maintaining compatibility with the conventional
JPEG2000 encoded data. 

[0087] In the third embodiment, JPEG2000 encoded 
data is input, and MPEG-4 encoded data is output. However,
the present invention is not limited to those specific
data. 

[0088] In the third embodiment, quantizers are provid- 
ed to improve coding efficiency. However, the quantizers 
may be omitted to obtain reversible codes free from any 
deterioration. 

[Fourth Embodiment] 

[0089] Fig. 6 is a block diagram showing the arrange- 
ment of an image processing apparatus according to the 
fourth embodiment of the present invention. Note that 
the same reference numerals denote the same building
components as those in the first embodiment (Fig. 1)
described above, and a detailed description thereof will 
be omitted. 

[0090] Referring to Fig. 6, reference numeral 101 denotes
a quantization value processor for partially changing the
quantization result. Reference numeral 102 denotes a bit
plane composition unit; and numeral 103 denotes an entropy
encoder. As in the first embodiment, the MPEG-4 data input
unit 1 inputs MPEG-4 encoded data containing one object and
background image in a core profile. The input encoded data
is supplied to the demultiplexer, and is demultiplexed into
encoded data that pertains to a shape code of the object,
encoded data that pertains to the texture of the object, and
encoded data that pertains to the background texture. The
encoded data that pertains to the shape code of the object
is supplied to the shape code decoder 3, the encoded data
that pertains to the texture of the object to the texture
decoder 4, and the encoded data that pertains to the
background texture to the texture decoder 5.

[0091] The shape code decoder 3 decodes binary information
that represents the object shape, and inputs it to the
shape information correction unit 6. The shape information
correction unit 6 expands the region to outside the object
shape in consideration of the number of taps of discrete
wavelet transformation as in the first embodiment. The
texture decoder 4 decodes the texture of the object. The
texture decoder 5 decodes the texture of the background.
The discrete wavelet transformer 8 receives the output from
the texture decoder 4 for pixels which are determined to
fall within the object based on the shape information
decoded by the shape code decoder 3, receives the output
from the texture decoder 5 for the region corrected and
expanded by the shape information correction unit 6, and
computes their discrete wavelet transforms.

[0092] The quantizer 10 quantizes the output from the
discrete wavelet transformer 8 by predetermined
quantization coefficients. Likewise, the quantizer 11
quantizes the output from the discrete wavelet transformer 9
by predetermined quantization coefficients. The
quantization result of the quantizer 10 is sent to the
quantization value processor 101, and the quantization
result of the quantizer 11 is sent to the bit plane
composition unit 102.

[0093] The quantization value processor 101 corrects
the quantization result input from the quantizer 10 in
accordance with the shape information supplied from the
shape information correction unit 6. In this case, the
processor 101 replaces a quantization value "0" by "1",
so that all quantization values in the object become
nonzero. The result is input to the bit plane composition
unit 102.
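
The replacement performed by the quantization value processor 101 can be sketched as follows, assuming non-negative quantization magnitudes and a per-coefficient boolean mask. The function name make_object_nonzero is hypothetical.

```python
def make_object_nonzero(quantized, mask):
    """Replace quantization value 0 by 1 for coefficients inside the object.

    quantized: list of non-negative ints; mask: list of bools (True = inside).
    Values outside the object are returned unchanged.
    """
    return [1 if inside and q == 0 else q for q, inside in zip(quantized, mask)]
```

After this step every coefficient inside the object is nonzero, so a zero in the object planes can only mean "outside the object".
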

[0094] The bit plane composition unit 102 composites
bit planes under the control of the shape information
correction unit 6. Fig. 7 shows this process.

[0095] Referring to Fig. 7, a given portion 700 of the
object is stored from the MSB to the 8th bit. At this time,
"0"s are stuffed in a portion 701. A portion associated
with the background is stored from the 7th bit to the LSB
(0th bit), and the object 700 and a background image
702 are composed without being mixed in bit planes.
The composition result is input to the entropy encoder
103. The entropy encoder 103 generates codes according
to the JPEG2000 format, and outputs them to the
code output unit 16.

[0096] Fig. 8 shows a generation example of the data. 
[0097] Fig. 8 shows the data format of JPEG2000 en- 
coded data. 

[0098] Referring to Fig. 8, reference numeral 801 de- 
notes a header containing a code which indicates infor- 
mation of the size of the encoded image or the like. Ref- 
erence numeral 802 denotes a BITS code as the encod- 
ing result of the number of bit planes. Reference numer- 
al 803 denotes data that stores the entropy encoding 
result of each bit plane. The entropy encoding result is 
separated into bit planes, each of which consists of en- 
coded data for respective subbands. The generated en- 
coded data is externally output via the code output unit 
16. 

[0099] With a series of operations, encoded data, 
which preserves background image data lost by stuffing 
"0"s in the conventional process, can be generated. 
Since the shape of the object can be discriminated by 
checking if upper bits are "0"s or "nonzero"s, the coding
efficiency can be improved without encoding the shape 
information of the object. 

[0100] In the fourth embodiment, MPEG-4 encoded 
data is input, and JPEG2000 encoded data is output. 
However, the present invention is not limited to such 
specific data. 

[0101] In the fourth embodiment, the quantization value
processor 101 replaces a value "0" by a minimum
value "1". However, the present invention is not limited 
to this, and the value "0" may be replaced by a quanti- 
zation value which never appears. In this case, replaced 
values are also encoded and sent, and the decoder re- 
places the substituted values by "0"s, thus preventing 
information from deteriorating. 

[0102] Furthermore, in the fourth embodiment, the 
quantizers 10 and 11 are provided to improve coding 
efficiency. However, the quantizers may be omitted to 
obtain reversible codes free from any deterioration. 

[Fifth Embodiment] 

[0103] Fig. 9 is a block diagram showing the arrange- 
ment of an image processing apparatus according to the 
fifth embodiment of the present invention. Note that the 
same reference numerals denote the same building 
components as those in the third embodiment (Fig. 5), 
and a detailed description thereof will be omitted. 
[0104] Referring to Fig. 9, reference numeral 151 denotes
an entropy decoder for decoding JPEG2000 encoded data.
Reference numeral 152 denotes a bit plane decomposition
unit for decomposing data associated with an object in
upper bits, and data associated with the background in
lower bits. Reference numeral 153 denotes a shape
extraction unit for extracting the shape of the object from
the data associated with the object. Reference numeral 154
denotes a quantization value processor for replacing
quantization values.

[0105] The fifth embodiment will explain a case 
wherein JPEG2000 encoded data generated in the 
fourth embodiment is input, and MPEG-4 encoded data 
is output. 

[0106] As in the third embodiment described above
with reference to Fig. 5, the code input unit 51 receives
encoded data generated by the fourth embodiment
mentioned above. The input encoded data is sent to the
entropy decoder 151. The entropy decoder 151 decodes
the header 801 (see Fig. 8) to acquire required
information, and inputs the acquired information to
respective units. Furthermore, the entropy decoder 151
decodes the BITS code 802 (see Fig. 8), and inputs
information to the respective units. Moreover, the entropy
decoder 151 decodes the data field 803 (see Fig. 8) in
units of bit planes in turn from the MSB side. Note that
the decoding result of the BITS code reveals that the
upper half bit planes store the data that pertains to the
object, and the lower half bit planes store the data that
pertains to the background. Therefore, the bit plane
decomposition unit 152 supplies the upper bit planes to the
shape extraction unit 153 and quantization value
processor 154, and the lower bit planes to the dequantizer
60.

[0107] The shape extraction unit 153 discriminates
each quantization value of the input bit planes. If the
quantization value is "0", the unit 153 determines a
region outside the object; if the quantization value is
"nonzero", it determines a region inside the object, and
generates binary shape information using these
discrimination results. The generated shape information is
input to the shape information correction unit 55. The
shape information correction unit 55 corrects the shape
information to that which represents the object shape,
since the number of taps of discrete wavelet
transformation is known, as in the third embodiment. The
corrected shape information is supplied to the shape
information encoder 56 and object shaping unit 63. The
shape information encoder 56 encodes the shape
information according to MPEG-4 shape information coding,
and supplies encoded data to the multiplexer 66, as in
the third embodiment.
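
The split into upper and lower planes and the zero/nonzero discrimination can be sketched as follows. The sketch assumes non-negative integer values and the Fig. 7 layout in which the background occupies the 7th to 0th bits. The name extract_shape and the parameter background_bits are hypothetical.

```python
def extract_shape(composed, background_bits=8):
    """Return (shape, upper, lower) for values laid out as described for Fig. 7.

    background_bits = 8 reflects the 7th..0th bit range given for the
    background; a different layout would need a different value.
    """
    low_mask = (1 << background_bits) - 1
    upper = [v >> background_bits for v in composed]  # planes holding the object
    lower = [v & low_mask for v in composed]          # planes holding the background
    shape = [u != 0 for u in upper]                   # nonzero means inside the object
    return shape, upper, lower
```

Here shape plays the role of the binary shape information, upper would be passed to the quantization value processor 154, and lower to the dequantizer 60.
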

[0108] On the other hand, the quantization value
processor 154 replaces all input quantization values "1" by
"0", and outputs them to the dequantizer 59. After that,
the dequantizer 59 dequantizes the input and supplies the
result to the inverse discrete wavelet transformer 61, and
the inverse discrete wavelet transformer 61 computes the
inverse discrete wavelet transforms, thus reconstructing
texture data, as in the third embodiment. The reconstructed
texture data is supplied to the object shaping unit 63,
which replaces a portion, which is determined to be a
region outside the object based on the shape information
corrected by the shape information correction unit 55, by
"0". The texture encoder 64 encodes the shaped texture
data of the object by MPEG-4 texture coding.
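
The replacement by the quantization value processor 154 can be sketched as follows, assuming non-negative quantization magnitudes. The function name restore_zero_values is hypothetical.

```python
def restore_zero_values(upper):
    """Map every quantization value 1 back to 0, as described for the processor 154."""
    return [0 if q == 1 else q for q in upper]
```

Note that a value that was genuinely "1" before encoding is also mapped to "0" here, so the round trip is not exact; the error is limited to the minimum quantization value.
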

[0109] The lower bit planes are dequantized by the 
dequantizer 60, and undergo inverse discrete wavelet 
transformation by the inverse discrete wavelet
transformer 62, thus reconstructing texture data, as in the
third embodiment. The background texture data is input
to the texture encoder 65, and is encoded by MPEG-4 
texture coding. 

[0110] The multiplexer 66 receives encoded data 
from the shape information encoder 56 and texture en- 
coders 64 and 65, and shapes these data to encoded 
data according to the MPEG-4 core profile format. The 
shaped encoded data is externally output via the MPEG- 
4 encoded data output unit 67 as MPEG-4 encoded data 
containing one object and background image in a core 
profile. 

[0111] With a series of operations mentioned above, 
encoded data which holds both object and background 
image data can be converted into object encoded data 
while maintaining compatibility with the conventional
JPEG2000 encoded data. Also, since the shape infor- 
mation of the object is reconstructed from the quantiza- 
tion values, it need not be sent, and deterioration of im- 
age quality upon replacing quantization values can be 
minimized since the quantization values are replaced by 
minimum values. 

[0112] In the fifth embodiment, JPEG2000 encoded
data is input, and MPEG-4 encoded data is output. However,
the present invention is not limited to those specific
data.

[0113] In the fifth embodiment, the quantization value 
processor 154 replaces "0" by another value, and when
the replaced value is encoded and sent, information can 
be prevented from deteriorating by replacing the re- 
placed value by "0". 

[0114] Furthermore, in the fifth embodiment, quantiz- 
ers are provided to improve coding efficiency. However, 
the quantizers may be omitted to obtain reversible 
codes free from any deterioration. 

[Sixth Embodiment] 

[0115] Fig. 10 is a block diagram showing the ar- 
rangement of an image processing apparatus according 
to the sixth embodiment of the present invention. 
[0116] Referring to Fig. 10, reference numeral 500 de- 
notes a central processing unit (CPU) for controlling the 
entire apparatus and executing various processes; and 
numeral 501 denotes a memory which stores an oper- 
ating system (OS) and software required for controlling 
the apparatus of this embodiment, and provides storage 
areas required for arithmetic operations. Reference
numeral 502 denotes a bus for connecting respective
units, various controllers, and various devices to
exchange data, control signals and the like; numeral 503
denotes a storage unit for storing software; numeral 504
denotes a storage unit for storing moving image data;
numeral 505 denotes a monitor (display) for displaying
an image, message, and the like; and numeral 508 denotes
a communication line which comprises a LAN, public line,
radio line, broadcast wave, or the like. Reference
numeral 507 denotes a communication interface for sending
encoded data onto the communication line 508. Reference
numeral 506 denotes a terminal which is used to start up
the apparatus, and to set various conditions such as a bit
rate, and the like.
[0117] The memory 501 has an area which stores the
OS that controls the overall apparatus and makes various
kinds of software run, and software to run, an image area
which temporarily loads image data to be encoded, a code
area which temporarily stores code data, and a working
area which stores parameters of various arithmetic
operations and the like.
[0118] In this arrangement, prior to a process, the user
selects moving image data to be encoded from those
stored in the storage unit 504 and instructs the apparatus
to start up at the terminal 506. In response to this
instruction, software stored in the storage unit 503 is
mapped on the memory 501 via the bus 502 and is
launched, thus starting the process.

[0119] The operation for converting MPEG-4 encoded
data stored in the storage unit 504 into JPEG2000
encoded data in units of frames by the CPU 500 will be
described below with reference to the flow chart shown
in Fig. 11. Note that this MPEG-4 encoded data is core
profile data, and contains a background and one object.
[0120] In step S1, MPEG-4 encoded data selected at
the terminal 506 is read out from the storage unit 504,
and is stored in the code area of the memory 501. The
flow advances to step S2 to read and decode encoded
data, which pertains to shape information of the object,
of the MPEG-4 encoded data, so as to generate a binary
image that represents the object shape. The binary
image is stored in the image area of the memory 501. The
flow advances to step S3 in which an expanded region
for expanding the shape information of the object is
computed from the number of taps of discrete wavelet
transformation used later. In this case, the vertical and
horizontal sizes of the expanded region of that object
can be uniquely determined based on the number of
taps and the number of subbands. A binary image that
represents the expanded region and the remaining
region is generated, and the flow advances to step S4.
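
Step S3 can be pictured as a dilation of the binary shape image. In the sketch below the margins margin_y and margin_x are hypothetical parameters: the text states that they follow from the number of taps and the number of subbands but gives no formula, so none is assumed here. The function name expand_mask is likewise illustrative.

```python
def expand_mask(mask, margin_y, margin_x):
    """Dilate a binary mask (list of rows of 0/1) by a rectangular margin.

    Returns (expanded, ring): expanded is the dilated mask, and ring marks
    only the pixels added by the dilation (the expanded region).
    """
    height, width = len(mask), len(mask[0])
    expanded = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if mask[y][x]:
                # Mark every pixel within the margin of an object pixel.
                for yy in range(max(0, y - margin_y), min(height, y + margin_y + 1)):
                    for xx in range(max(0, x - margin_x), min(width, x + margin_x + 1)):
                        expanded[yy][xx] = 1
    ring = [[expanded[y][x] - mask[y][x] for x in range(width)]
            for y in range(height)]
    return expanded, ring
```

The ring corresponds to the pixels that step S7 later treats as background texture.
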
[0121] In step S4, a mask that represents the shape
of an ROI of JPEG2000 coding is encoded on the basis
of a header as encoded data which pertains to the
characteristics of an image of the JPEG2000 encoded data
to be generated, and the shape information and expanded
region information stored in the image area of the
memory 501, and is stored in the code area of the
memory 501. The flow advances to step S5 to read out and
decode encoded data, which pertains to the texture of
the object, from the MPEG-4 encoded data stored in the
code area of the memory 501, and to store image data
generated by decoding in the image area of the memory
501. The flow advances to step S6 to read out and
decode encoded data, which pertains to the background
texture, from the MPEG-4 encoded data stored in the
code area of the memory 501, and to store the generated
image data in the image area of the memory 501.
[0122] The flow advances to step S7 to compute the 
discrete wavelet transforms of pixels, which are deter- 
mined to fall within the object based on the shape infor- 
mation of the object generated in step S2, as texture 
data of the object, pixels, which belong to the expanded 
region generated in step S3, as texture data of the back- 
ground, and other pixels as "0". The computation result 
is stored in the working area of the memory 501. The 
flow then advances to step S8 to quantize the object 
transformation result stored in the working area of the 
memory 501 in accordance with predetermined
quantization coefficients.

[0123] The flow advances to step S9 to encode the
quantization result of the object stored in the working
area of the memory 501 in step S8 in turn from a bit
plane on the MSB side, and to store the encoding result
after the code that pertains to the mask in the code area
of the memory 501. The flow advances to step S10 to
compute the discrete wavelet transforms of the
background texture data, and to store the result in the
working area of the memory 501. The flow then advances to
step S11. In step S11, the background transformation
result stored in the working area is quantized in
accordance with predetermined quantization coefficients.
The flow advances to step S12 to encode the background
quantization result stored in the working area of the
memory 501 in step S11 in turn from a bit plane on the
MSB side, and to store the encoding result after the code
that pertains to the texture of the object stored in the
code area of the memory 501. The JPEG2000 encoded
data generated in the code area of the memory 501 in
this way is stored at a predetermined location in the
storage unit 504. Upon completion of the process in step
S12, the encoding process of the frame of interest ends,
and the next frame is processed or the process ends.
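
Steps S9 and S12 both encode a quantization result "in turn from a bit plane on the MSB side". The plane ordering alone can be sketched as follows; the entropy coding itself is omitted, and the function name and the assumption of non-negative integer magnitudes are illustrative.

```python
def bit_planes_msb_first(values, num_bits):
    """Yield (plane_index, bits) from the MSB plane down to the LSB plane.

    values: list of non-negative ints; num_bits: number of bit planes.
    """
    for plane in range(num_bits - 1, -1, -1):
        # Collect the bit at position `plane` from every value.
        yield plane, [(v >> plane) & 1 for v in values]
```

An encoder would pass each yielded plane to the entropy coder in this order, so that truncating the code stream drops the least significant planes first.
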
[0124] With a series of operations mentioned above, 
encoded data which holds both object and background 
image data can be converted into object encoded data 
while maintaining compatibility with the conventional
JPEG2000 encoded data. 

[0125] In the sixth embodiment, MPEG-4 encoded
data is input, and JPEG2000 encoded data is output. However,
the present invention is not limited to those specific
data. 

[Seventh Embodiment] 

[0126] As the seventh embodiment of the present
invention, the operation for converting JPEG2000 encoded
data in units of frames, which are generated in the
sixth embodiment mentioned above using the arrangement
of the image processing apparatus shown in Fig.
10 and are stored in the storage unit 504, into MPEG-4
encoded data will be explained below with reference to
the flow chart shown in Fig. 12.

[0127] In step S101, JPEG2000 encoded data selected
at the terminal 506 is read out from the storage unit
504, and is stored in the code area of the memory 501.
The header and mask information of the JPEG2000 encoded
data are decoded, and the decoded mask information is
stored in the image area of the memory 501.
The flow advances to step S102 to read out encoded
data of bit planes, which correspond to the ROI, of the
JPEG2000 encoded data stored in the code area of the
memory 501, to decode that encoded data, and to store
the decoded data in the image area of the memory 501.
The stored data is the quantization result of the texture
data of the object.

[0128] The flow advances to step S103 to compute the region
expanded in step S3 in Fig. 11 on the basis of the mask
information stored in the image area of the memory 501 and
the discrete wavelet transformation used upon encoding, and
to store the region in the image area of the memory 501 as a
binary image. The flow advances to step S104 to correct the
mask information obtained by decoding in step S101 by
removing the expanded region computed in step S103 from that
mask information, thus generating the shape information of
the object. The shape information is encoded and stored in
the code area of the memory 501. The flow advances to step
S105 to dequantize the quantization result of the object
texture stored in the image area of the memory 501 in step
S102, and to store the dequantization result in the image
area of the memory 501. The flow advances to step S106 to
generate image data by computing the inverse discrete
wavelet transforms of the dequantization result of the
object texture generated in step S105, and to store that
image data in the image area of the memory 501. The flow
advances to step S107 to replace pixel data corresponding to
the expanded region of the object computed in step S103 by
"0", and to store them in the image area of the memory 501.
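
The mask correction of step S104 and the clearing of step
S107 can be sketched as follows. This is a minimal
illustration and not the implementation of the embodiment:
it assumes that the mask, the expanded region, and the image
are small nested lists of equal size, and the function names
correct_mask and clear_expanded_pixels are hypothetical. How
the expanded region itself is derived from the wavelet
filter (step S103) is not shown.

```python
def correct_mask(mask, expanded):
    """Remove the expanded region from the decoded mask (cf. step S104)."""
    return [
        [m & (1 - e) for m, e in zip(mask_row, exp_row)]
        for mask_row, exp_row in zip(mask, expanded)
    ]


def clear_expanded_pixels(image, expanded):
    """Replace pixels inside the expanded region by 0 (cf. step S107)."""
    return [
        [0 if e else p for p, e in zip(img_row, exp_row)]
        for img_row, exp_row in zip(image, expanded)
    ]


# Illustrative data only: 1 marks a position inside the region.
mask = [[0, 1, 1, 1, 0],
        [0, 1, 1, 1, 0]]
expanded = [[0, 1, 0, 1, 0],
            [0, 1, 0, 1, 0]]
image = [[9, 8, 7, 6, 5],
         [4, 3, 2, 1, 0]]

shape = correct_mask(mask, expanded)
object_pixels = clear_expanded_pixels(image, expanded)
```

In this sketch the corrected shape keeps only the centre
column, and the pixels that belonged to the expanded border
are cleared before MPEG-4 texture coding.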
[0129] The flow advances to step S108 to generate encoded
data by texture-encoding the image data of the object stored
in step S107 by MPEG-4, and to store the encoded data after
the shape information encoded data in the code area of the
memory 501. Since the shape information encoded data and
texture encoded data are MPEG-4 encoded data of the object,
they are stored at a predetermined location in the storage
unit 504.
[0130] The flow advances to step S109 to decode lower bit
planes which are stored in the code area of the memory 501
and remain undecoded, and to store the decoded data in the
image area of the memory 501. The stored data is the
quantization result of the background texture. The flow
advances to step S110, in which the quantization result of
the background texture stored in the image area of the
memory 501 in step S109 is dequantized, and the
dequantization result is stored in the image area of the
memory 501. The flow advances to step S111 to generate image
data by computing the inverse discrete wavelet transforms of
the dequantization result of the background texture
generated in step S110, and to store the image data in the
image area of the memory 501. The flow advances to step S112
to generate encoded data by encoding the background image
data stored in step S111 by MPEG-4 texture coding, and to
save the encoded data at a predetermined location of the
storage unit 504 as encoded data of the texture of the
background image. The flow then advances to step S113 to
output the stored data as MPEG-4 encoded data.

[0131] With a series of operations mentioned above, 
encoded data which holds both object and background 
image data can be converted into object encoded data 
while maintaining compatibility to the conventional 
JPEG2000 encoded data. 

[0132] In the seventh embodiment, JPEG2000 en- 
coded data is input, and MPEG-4 encoded data is out- 
put. However, the present invention is not limited to 
those specific data. 

[0133] In the seventh embodiment, MPEG-4 encod- 
ing in units of frames has been exemplified, but motion 
compensation may be done. 

[0134] Furthermore, the background image and object image
may be composed in accordance with the shape information,
and the composite image may be displayed on the monitor 505,
stored in the storage unit 504, or output onto the
communication line 508 via the communication interface 507.

[Eighth Embodiment] 

[0135] Fig. 20 is a block diagram showing the ar- 
rangement of an image processing apparatus according 
to the eighth embodiment of the present invention. 
[0136] In Fig. 20, the shape information encoder 56 
and texture encoders 64 and 65 in Fig. 5 are replaced 
by a shape information output unit 856 and texture out- 
put units 864 and 865, respectively, and the multiplexer 
66 and MPEG-4 encoded data output unit 67 are omit- 
ted. Note that the same reference numerals denote the 
same building components as those in the third embod- 
iment (Fig. 5) mentioned above, and a detailed descrip- 
tion thereof will be omitted. 

[0137] Referring to Fig. 20, reference numeral 856 de- 
notes a shape information output unit for outputting gen- 
erated shape information. Reference numeral 864 de- 
notes a texture output unit for outputting generated im- 
age data of the object. Reference numeral 865 denotes 
a texture output unit for outputting generated image data 
of the background. 

[0138] The eighth embodiment will explain a case 
wherein JPEG2000 encoded data generated by the first 
embodiment described above is input and reconstruct- 
ed. 

[0139] The code input unit 51 receives encoded data
generated by the aforementioned first embodiment, as in the
third embodiment described previously with reference to
Fig. 5. The input encoded data is input to the demultiplexer
52 to decode a header, and respective encoded data are input
to the flag discrimination unit 53, mask decoder 54, and
entropy decoder 57. The flag discrimination unit 53 checks
the presence/absence of the background, and the mask decoder
54 decodes mask information as in the third embodiment. The
decoded mask information is corrected by the shape
information correction unit 55, and is supplied to the
object shaping unit 63. Also, the mask information is
externally output via the shape information output unit 856.
[0140] The entropy decoder 57 decodes respective bit planes,
and the bit plane decomposition unit 58 decomposes and
outputs bit plane data to the dequantizers 59 and 60 in
accordance with an instruction from the mask decoder 54.

[0141] After that, as in the third embodiment, the object
encoded data undergoes dequantization and inverse discrete
wavelet transformation to reconstruct image data, and the
image data is shaped by the object shaping unit 63. The
shaped image data is externally output via the texture
output unit 864. Also, the background encoded data undergoes
dequantization and inverse discrete wavelet transformation
to reconstruct image data, and that image data is externally
output via the texture output unit 865. With a series of
operations mentioned above, object and background image data
can be reconstructed from the conventional JPEG2000 encoded
data.

[Ninth Embodiment] 

[0142] Fig. 21 is a block diagram showing the arrangement of
an image processing apparatus according to the ninth
embodiment of the present invention. Note that the same
reference numerals denote the same building components as in
the fifth embodiment (Fig. 9) mentioned above, and a
detailed description thereof will be omitted.

[0143] Referring to Fig. 21, reference numeral 956 denotes a
shape information output unit for outputting generated shape
information. Reference numeral 964 denotes a texture output
unit for outputting generated object image data. Reference
numeral 965 denotes a texture output unit for outputting
generated background image data.

[0144] The ninth embodiment will explain a case wherein
JPEG2000 encoded data generated by the fourth embodiment is
input and reproduced.
[0145] As in the fifth embodiment that has been explained
above with reference to Fig. 9, the code input unit 51
receives encoded data generated by the fourth embodiment
mentioned above. The input encoded data is supplied to the
entropy decoder 151 to decode a header, BITS code, and data
portion (see Fig. 8), and to decode respective bit planes.
The bit plane decomposition unit 152 decomposes upper and
lower bit planes, and supplies the upper bit planes to the
shape extraction unit 153 and quantization value processor
154, and the lower bit planes to the dequantizer 60.
[0146] The shape extraction unit 153 generates shape
information by discriminating regions inside and outside the
object on the basis of the quantization values as in the
fifth embodiment. The generated shape information is
corrected by the shape information correction unit 55, and
is input to the object shaping unit 63. Also, the shape
information is externally output via the shape information
output unit 956.

[0147] As in the fifth embodiment mentioned above, the
quantization values of the object encoded data are replaced
by the quantization value processor 154, and the replaced
data undergoes dequantization and inverse discrete wavelet
transformation to reconstruct image data. The image data is
then shaped by the object shaping unit 63, and is externally
output via the texture output unit 964. Also, the background
encoded data undergoes dequantization and inverse discrete
wavelet transformation to reconstruct image data, and the
image data is externally output via the texture output unit
965. With a series of operations mentioned above, object and
background image data can be reconstructed from the JPEG2000
encoded data.

[10th Embodiment] 

[0148] As the 10th embodiment of the present invention, the
operation for reconstructing image data from JPEG2000
encoded data in units of frames, which are generated by the
sixth embodiment mentioned above using the arrangement of
the image processing apparatus shown in Fig. 10, and are
stored in the storage unit 504, will be described below with
reference to the flow chart shown in Fig. 22.

[0149] In step S201, JPEG2000 encoded data selected at the
terminal 506 is read out from the storage unit 504, and is
stored in the code area of the memory 501. A header and mask
information of the JPEG2000 encoded data are decoded, and
the decoded mask information is stored in the image area of
the memory 501. The flow then advances to step S202 to read
out and decode encoded data of bit planes corresponding to
an ROI of the JPEG2000 encoded data stored in the code area
of the memory 501, and to store the quantization result of
texture of the object in the image area of the memory 501.

[0150] The flow advances to step S203 to compute the region
expanded in step S3 in Fig. 11 on the basis of the mask
information stored in the image area of the memory 501 and
the discrete wavelet transformation used upon encoding, and
to store the region in the image area of the memory 501 as a
binary image. The flow advances to step S204 to generate
shape information of the object by correcting the mask
information obtained by decoding in step S201, i.e., by
removing the expanded region computed in step S203 from that
mask information. The generated shape information is stored
in the image area of the memory 501, and is output to an
external device, e.g., the monitor 505.
[0151] The flow then advances to step S205 to dequantize the
quantization result of the object texture stored in the
image area of the memory 501 in step S202, and to store the
dequantization result in the image area of the memory 501.
The flow advances to step S206 to generate image data by
computing the inverse discrete wavelet transforms of the
dequantization result of the object texture generated in
step S205, and to store that image data in the image area of
the memory 501. The flow advances to step S207 to replace
pixel data corresponding to the expanded region of the
object computed in step S203 by "0", and to store them in
the image area of the memory 501. The flow then advances to
step S208 to output the stored data to an external device,
e.g., the monitor 505.

[0152] The flow advances to step S209 to decode lower bit
planes which are stored in the code area of the memory 501
and remain undecoded, and to store the decoded data in the
image area of the memory 501. The flow advances to step
S210. In step S210, the quantization result of the
background texture stored in the image area of the memory
501 in step S209 is dequantized, and the dequantization
result is stored in the image area of the memory 501. The
flow advances to step S211 to generate image data by
computing the inverse discrete wavelet transforms of the
dequantization result of the background texture generated in
step S210, and to store the image data in the image area of
the memory 501. The image data is then output to an external
device, e.g., the monitor 505.

[0153] Since the monitor 505 displays composite data of
these image data, a composite image of background and object
images can be displayed.
[0154] With a series of operations mentioned above, 
object and background image data can be reconstruct- 
ed from the JPEG2000 encoded data. 

[11th Embodiment] 

[0155] Fig. 24 is a block diagram showing the ar- 
rangement of an image processing apparatus according 
to the 11th embodiment of the present invention. Note 
that the 11th embodiment will explain a case wherein 
MPEG-4 encoded data is input and encoded, and is out- 
put as JPEG2000 encoded data. 
[0156] Referring to Fig. 24, reference numeral 2401 denotes
an encoded data input unit for inputting MPEG-4 encoded
data. Reference numeral 2402 denotes a demultiplexer for
demultiplexing the input MPEG-4 encoded data, and supplying
the demultiplexed data to respective units. Reference
numeral 2403 denotes a shape code decoder for receiving and
decoding shape encoded data of an object, which is encoded
by MPEG-4 and is demultiplexed by the demultiplexer 2402.
Reference numeral 2404 denotes a texture decoder for
decoding the texture of the object demultiplexed by the
demultiplexer 2402. Reference numeral 2405 denotes a texture
decoder for decoding the texture of encoded data of a
background image demultiplexed by the demultiplexer 2402.
Reference numeral 2406 denotes an audio buffer for storing
audio encoded data. In this embodiment, the audio encoded
data is encoded by HVXC, i.e., has undergone very low bit
rate encoding. Reference numeral 2407 denotes an image
composition unit for superposing the object texture decoded
by the texture decoder 2404 on the background image texture
decoded by the texture decoder 2405 in accordance with the
shape information decoded by the shape code decoder 2403.
Reference numeral 2408 denotes a discrete wavelet
transformer for computing the discrete wavelet transforms of
input image data. Reference numeral 2409 denotes a quantizer
for receiving and quantizing transform coefficients computed
by the discrete wavelet transformer 2408. Reference numeral
2410 denotes a bit shift unit for shifting bit planes on the
basis of the quantization result of the quantizer 2409 in
accordance with the number of bits that form the bit planes
and the mask information decoded by the shape code decoder
2403. Reference numeral 2411 denotes a bit plane composition
unit for composing the contents of the audio buffer 2406 by
stuffing them in the order of bits in lower bits of a region
designated as an object by the mask information in
accordance with the number of bits that form the bit planes
and the mask information decoded by the shape code decoder
2403. Reference numeral 2412 denotes a mask encoder for
encoding mask information that represents the ROI shape and
position. Reference numeral 2413 denotes an entropy encoder
for encoding data composed by the bit plane composition unit
2411 in units of bit planes. Reference numeral 2414 denotes
a multiplexer for shaping the outputs from the mask encoder
2412 and entropy encoder 2413 to encoded data according to
the JPEG2000 format. Reference numeral 2415 denotes a code
output unit for outputting the generated encoded data.
[0157] The operation of the aforementioned arrangement will
be explained below. In this embodiment, a process of MPEG-4
encoded data for each frame will be explained. By repeating
this process in correspondence with the number of frames,
all data can be processed.

[0158] The encoded data input unit 2401 inputs MPEG-4
encoded data consisting of one object, background image, and
audio encoded data in a core profile. The input encoded data
is supplied to the demultiplexer 2402, and is demultiplexed
into encoded data which pertains to a shape code of the
object, encoded data that pertains to the texture of the
object, encoded data that pertains to the texture of the
background, and audio encoded data. The encoded data that
pertains to the shape code of the object is supplied to the
shape code decoder 2403, the encoded data that pertains to
the object texture to the texture decoder 2404, the encoded
data that pertains to the background texture to the texture
decoder 2405, and the audio encoded data to the audio buffer
2406.

[0159] The shape code decoder 2403 decodes binary 
information that represents the object shape. In this em- 
bodiment, shape data shown in, e.g., Fig. 25B will be 
exemplified as such shape information. 
[0160] Since the decoded shape information serves 
as ROI mask information, it is input to the mask encoder 
2412, and is encoded according to the JPEG2000 for- 
mat. 

[0161] The texture decoder 2404 decodes the object texture.
In this embodiment, texture shown in Fig. 25A will be
exemplified as an example of the object texture. The texture
decoder 2405 decodes the background texture. In this
embodiment, texture shown in Fig. 25C will be exemplified as
an example of the background texture. The image composition
unit 2407 composites the object texture with the background
image texture in accordance with the shape information
decoded by the shape code decoder 2403.
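
The composition performed by the image composition unit 2407
can be sketched as follows. This is only an illustration
under simple assumptions: the textures and the binary shape
information are nested lists of equal size, a shape value of
1 marks the object, and the function name compose is
hypothetical.

```python
def compose(object_texture, background_texture, shape):
    """Take the object pixel where the shape is 1, else the background pixel."""
    return [
        [o if s else b for o, b, s in zip(obj_row, bg_row, shape_row)]
        for obj_row, bg_row, shape_row in zip(
            object_texture, background_texture, shape
        )
    ]


# Illustrative data only.
obj = [[0, 200, 210],
       [0, 220, 0]]
bg = [[10, 11, 12],
      [13, 14, 15]]
shape = [[0, 1, 1],
         [0, 1, 0]]

composite = compose(obj, bg, shape)
```

The composite image, not the separate textures, is what the
discrete wavelet transformer 2408 receives.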

[0162] Fig. 18 mentioned previously shows this process. The
discrete wavelet transformer 2408 computes the discrete
wavelet transforms of the composite image data.

[0163] The quantizer 2409 receives the output from 
the discrete wavelet transformer 2408, and quantizes it 
by predetermined quantization coefficients. The quanti- 
zation result of the quantizer 2409 is input to the bit shift 
unit 2410. Also, the number of bits required to express 
the quantization result is input to the multiplexer 2414. 
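
As an illustration of the two outputs of the quantizer 2409,
the following sketch quantizes a few coefficients and
derives the number of bits needed to express the result. The
specification does not state the quantization rule, so the
uniform step with truncation toward zero used here is an
assumption, as are the function names quantize and
bits_required and the sample values.

```python
def quantize(coeffs, step):
    """Uniform quantization with truncation toward zero (assumed rule)."""
    return [[int(c / step) for c in row] for row in coeffs]


def bits_required(quantized):
    """Number of magnitude bits needed for the largest quantized value."""
    return max(abs(q) for row in quantized for q in row).bit_length()


# Illustrative transform coefficients and step size.
coeffs = [[40.0, -100.5],
          [7.9, 255.0]]

quantized = quantize(coeffs, 8)
nbits = bits_required(quantized)
```

Here nbits counts magnitude bits only; how the sign is
carried is outside the scope of this sketch.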
[0164] The bit shift unit 2410 prepares bit planes, the
number of which is twice the number of bits computed by the
quantizer 2409, while setting the region of the background
texture corresponding to the object as a region of interest
on the basis of the quantization result input from the
quantizer 2409 and the shape information input from the
shape code decoder 2403, and shifts the object portion to
upper bits in accordance with the shape information from the
shape code decoder 2403. Fig. 19 shows this process taking
an LL frequency band as an example.
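
The bit shift can be illustrated with a small sketch. It
assumes non-negative quantized magnitudes held in nested
lists, a binary shape mask of the same size, and a
hypothetical function name shift_roi; with nbits bits per
coefficient, the shifted data occupies 2 x nbits bit planes.

```python
def shift_roi(quantized, mask, nbits):
    """Move object coefficients up by nbits; background stays in the lower bits."""
    return [
        [(q << nbits) if inside else q for q, inside in zip(q_row, m_row)]
        for q_row, m_row in zip(quantized, mask)
    ]


# Illustrative 4-bit quantization result; only the top-right
# coefficient belongs to the object.
quantized = [[5, 12],
             [3, 9]]
mask = [[0, 1],
        [0, 0]]

shifted = shift_roi(quantized, mask, 4)
```

After the shift, the object coefficient 12 (binary 1100)
becomes 192 (binary 11000000): its upper four bit planes
hold the object data and its lower four are left as "0".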

[0165] In this manner, the bit shift unit 2410 stuffs the
quantization result of the background texture in the lower
bits of a region that does not overlap the object, and
stuffs "0"s in their upper bits on the basis of the shape
information. Also, the bit shift unit 2410 outputs the
quantization result of the object to the upper bits of the
overlapping region, and stuffs "0"s in their lower bits.
[0166] The bit plane composition unit 2411 reads out the
audio encoded data for one frame interval from the audio
buffer to the lower bits at the position of the object on
the basis of the image data input from the bit shift unit
2410 and the shape information decoded by the shape code
decoder 2403, and replaces the lower bits by the audio
encoded data for each bit in the order of scan lines.
[0167] A process until the bit data shown in Fig. 19 is
generated will be briefly explained below. In order to
encode both the object and background, the object and
background texture corresponding to a region outside the
object region are composed, and the composite data undergoes
frequency transformation to generate transform coefficients.
Of these transform coefficients, bits corresponding to the
object region are shifted to upper bit planes, and "0" bits
are stuffed in the blank fields 202 outside the object
region, which are generated by the bit shift process. In
addition, the audio encoded data for one frame time is
stuffed in the blank fields 203 within the object region,
which are generated by the bit shift process.
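
The stuffing of audio encoded data into the blank fields 203
can be sketched as follows. This is an illustration under
stated assumptions rather than the exact bit order of the
embodiment: object positions are visited in raster order,
each one receives the next nbits audio bits (most
significant first), and missing audio bits are padded with
0. The function name stuff_audio and the sample values are
hypothetical.

```python
def stuff_audio(shifted, mask, nbits, audio_bits):
    """Fill the lower nbits of every object coefficient with audio bits."""
    bits = iter(audio_bits)
    result = []
    for value_row, mask_row in zip(shifted, mask):
        row = []
        for value, inside in zip(value_row, mask_row):
            if inside:
                payload = 0
                for _ in range(nbits):
                    # Pad with 0 once the audio data for the frame runs out.
                    payload = (payload << 1) | next(bits, 0)
                # Clear the blank lower bits, then insert the audio payload.
                value = ((value >> nbits) << nbits) | payload
            row.append(value)
        result.append(row)
    return result


# Bit-shifted data: one object coefficient (192) with blank lower bits.
shifted = [[5, 192],
           [3, 9]]
mask = [[0, 1],
        [0, 0]]

planes = stuff_audio(shifted, mask, 4, [1, 0, 1, 1])
```

The object coefficient becomes 203 (binary 11001011): the
upper four bits still hold the object data, while the lower
four now carry the audio bits 1011. Background coefficients
are untouched.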

[0168] The entropy encoder 2413 encodes bit planes 
in turn from the MSB side, and supplies the encoding 
result to the multiplexer 2414. The multiplexer 2414 
shapes the input data to encoded data according to the 
JPEG2000 format. 

[0169] The process until encoding according to the 11th
embodiment of the present invention will be explained below
with reference to the flow chart shown in Fig. 26.

[0170] In step S401, the object, background, and audio
encoded data are acquired in order to decode the MPEG-4
encoded data. The flow advances to step S402 to decode the
object and background. In step S403, the object and
background are composed, and the composite image undergoes
frequency transformation to generate transform coefficients.
The flow advances to step S404 to bit-shift bits
corresponding to the object region of these transform
coefficients to upper bit planes, and to stuff "0" bits in
the blank fields 202 (Fig. 19) outside the object region,
which are generated by the bit shift process. The flow
advances to step S405 to stuff the audio encoded data in the
blank fields 203 (Fig. 19) within the object region, which
are generated by the bit shift process. Finally, the flow
advances to step S406 to encode the bit data shown in
Fig. 19 obtained in this way in turn from a bit plane on the
MSB side by entropy coding.
[0171] Fig. 27 shows an output example of the encoded data
obtained by the aforementioned encoding process.

[0172] In Fig. 27, a header including a code which indicates
information of the size of the encoded image or the like is
followed by a BITS code indicating the number of bit planes.
Then, the encoding result of the mask information output
from the mask encoder 2412 appears, and a SHIFT code
indicating the presence of audio encoded data in the lower
bits of the object then follows. The entropy encoding result
is separated into subbands (LL to HH1), each of which
consists of encoded data for 16 bit planes. The multiplexed
encoded data is externally output via the code output unit
2415.
[0173] With a series of operations mentioned above, audio
encoded data can be appended to image data in which only
"0"s are stuffed in the conventional process, and the audio
information can be reproduced in synchronism with a
reproduced moving image.
[0174] In the 11th embodiment, MPEG-4 encoded data is input,
and JPEG2000 encoded data is output. However, the present
invention is not limited to such specific data.

[0175] Furthermore, in the 11th embodiment, the 
quantizer 2409 is provided to improve coding efficiency. 
However, the quantizer may be omitted to obtain revers- 
ible codes free from any deterioration. 
[0176] In the 11th embodiment, audio data is exemplified as
data to be appended, but other kinds of information may be
appended.

[0177] In the aforementioned arrangement, some or all
functions may be implemented by software or the like.

[12th Embodiment] 

[0178] Fig. 28 is a block diagram showing the arrangement of
an image processing apparatus according to the 12th
embodiment of the present invention. The 12th embodiment
will explain a case wherein JPEG2000 encoded data generated
by the 11th embodiment is input, and a moving image is
reproduced.
[0179] Referring to Fig. 28, reference numeral 2851 denotes
a code input unit for inputting JPEG2000 encoded data
generated by the 11th embodiment. Reference numeral 2852
denotes a demultiplexer for demultiplexing the input encoded
data, and supplying the demultiplexed data to respective
units. Reference numeral 2853 denotes a mask decoder for
decoding mask information which represents the ROI shape and
position, a BITS code that indicates the number of bits of
the entire data, and a SHIFT code. Reference numeral 2854
denotes an entropy decoder for decoding encoded data in
units of bit planes. Reference numeral 2855 denotes a data
demultiplexer for demultiplexing encoded data into bit
planes of the ROI portion, bit planes of the remaining
portion (background portion), and audio encoded data, and
outputting them to a bit shift unit 2856 and audio buffer
2861. The bit shift unit 2856 bit-shifts the ROI portion in
the lower (LSB) direction. A dequantizer 2857 dequantizes
the quantization result of the quantizer 2409. Reference
numeral 2858 denotes an inverse discrete wavelet transformer
for computing the inverse of the discrete wavelet
transformation performed by the discrete wavelet transformer
2408. Reference numeral 2859 denotes a frame memory for
storing decoded image data. Reference numeral 2860 denotes a
display for displaying the contents of the frame memory
2859. Reference numeral 2861 denotes an audio buffer for
storing the audio encoded data demultiplexed by the data
demultiplexer 2855. Reference numeral 2862 denotes an audio
decoder for decoding audio data. Reference numeral 2863
denotes a sound device for converting the decoded audio data
into audible sound, and reproducing the sound.

[0180] In the aforementioned arrangement, the code input
unit 2851 inputs encoded data generated by the 11th
embodiment. The input encoded data is input to the
demultiplexer 2852 to decode a header, thus acquiring
required information and supplying such information to
respective units. Furthermore, encoded data of a BITS code,
SHIFT code, and mask information are input to the mask
decoder 2853, and the remaining data is input to the entropy
decoder 2854.

[0181] The mask decoder 2853 decodes the SHIFT 
code to check if audio encoded data is appended to the 
lower bits of the ROI portion. If it is determined that no 
audio encoded data is appended, a normal ROI process 
of JPEG2000 is executed. On the other hand, if it is de- 
termined that audio encoded data is appended, that au- 
dio encoded data is demultiplexed to reproduce audio. 
[0182] A case will be explained first wherein audio encoded
data is appended.

[0183] The mask decoder 2853 decodes mask infor- 
mation indicating the ROI shape and position, and the 
BITS code indicating the number of bits of the entire da- 
ta. 

[0184] On the other hand, the entropy decoder 2854 decodes
bit planes in turn from the MSB side, and inputs the
decoding result to the data demultiplexer 2855. Fig. 19
shows bit plane data decoded in this way. In Fig. 19, the
texture data 200 of the object and data 202 stuffed with
"0"s are input to the bit shift unit 2856.
[0185] The texture data 204 of the background image in
Fig. 19, and stuffed audio encoded data 203 are
demultiplexed in accordance with the shape information
decoded by the mask decoder 2853, and are respectively
supplied to the bit shift unit 2856 and audio buffer 2861.

[0186] The bit shift unit 2856 shifts the bits of the ROI
portion to the LSB side to generate bit data shown in
Fig. 18, and inputs that data to the dequantizer 2857. The
dequantizer 2857 executes dequantization corresponding to
the quantization of the quantizer 2409 (Fig. 24), and its
dequantization result is supplied to the inverse discrete
wavelet transformer 2858. The inverse discrete wavelet
transformer 2858 reconstructs texture data by computing the
inverse discrete wavelet transforms of the inputs, and
stores it in the frame memory 2859. The image data stored in
this manner is displayed on the display 2860. At the same
time, the audio encoded data stored in the audio buffer 2861
is decoded by the audio decoder 2862 and is reproduced by
the sound device 2863.
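
The demultiplexing and the bit shift toward the LSB side can
be sketched together as follows. This is an illustration
under assumptions, not the embodiment's implementation:
decoded bit plane data is held as non-negative integers of
2 x nbits bits, audio bits are read from the lower nbits of
each object position in raster order (most significant
first), and the function name demultiplex is hypothetical.

```python
def demultiplex(planes, mask, nbits):
    """Split decoded data into quantized coefficients and audio bits."""
    low_mask = (1 << nbits) - 1
    coeffs, audio_bits = [], []
    for value_row, mask_row in zip(planes, mask):
        row = []
        for value, inside in zip(value_row, mask_row):
            if inside:
                payload = value & low_mask
                # Lower bits of an object position carry the audio data.
                audio_bits.extend(
                    (payload >> k) & 1 for k in range(nbits - 1, -1, -1)
                )
                # Shift the object coefficient back toward the LSB side.
                row.append(value >> nbits)
            else:
                # Background coefficients already sit in the lower bits.
                row.append(value & low_mask)
        coeffs.append(row)
    return coeffs, audio_bits


# Decoded data with one object coefficient carrying four audio bits.
planes = [[5, 203],
          [3, 9]]
mask = [[0, 1],
        [0, 0]]

coeffs, audio_bits = demultiplex(planes, mask, 4)
```

In this sketch coeffs corresponds to the data passed on for
dequantization, and audio_bits to the data written to the
audio buffer 2861.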
[0187] A case will be described below wherein the 
mask decoder 2853 determines that no audio encoded 
data is appended. 

[0188] In this case, the mask decoder 2853 performs control
so as not to operate the data demultiplexer 2855, audio
buffer 2861, audio decoder 2862, and sound device 2863. The
bit shift unit 2856 is controlled to execute a normal ROI
process of JPEG2000.

[0189] The mask decoder 2853 decodes mask information
indicating the ROI shape and position, and the BITS code
indicating the number of bits of the entire data. The
entropy decoder 2854 decodes bit planes in turn from the MSB
side, and inputs the decoding result to the bit shift unit
2856 via the data demultiplexer 2855. The bit shift unit
2856 receives the bit plane data similar to that shown in
Fig. 19. In this case, "0"s are stuffed in place of the
audio encoded data 203 in Fig. 19.

[0190] In Fig. 19, the texture data 200 of the object is
shifted to the lower bits in accordance with the shape
information and the number of bits decoded by the mask
decoder. The bit plane data at that time has the bit plane
configuration shown in Fig. 18.

[0191] The dequantizer 2857 dequantizes the input that has
undergone the bit shift process toward the LSB side, and the
inverse wavelet transformer 2858 computes the inverse
discrete wavelet transforms. The image data that has
undergone the inverse discrete wavelet transformation is
stored in the frame memory 2859. The image data stored in
the frame memory 2859 in this way is displayed by the
display 2860.
[0192] As the characteristic feature of the ROI, even when
this decoding process is aborted, an image can be reclaimed
by decoding only upper bits irrespective of the
presence/absence of audio data.
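
A small sketch illustrates why an aborted decoding still
yields the object. It assumes 8 bit planes per coefficient
(4 for the object, 4 below it), treats planes that were
never decoded as "0", and uses a hypothetical function name
decode_partial and illustrative values.

```python
def decode_partial(value, total_planes, decoded_planes):
    """Keep only the planes decoded before the abort; the rest read as 0."""
    missing = total_planes - decoded_planes
    return (value >> missing) << missing


NBITS = 4  # bits per coefficient before the bit shift (assumed)

# Object coefficient 12 with audio bits 1011 in its lower planes.
with_audio = decode_partial(0b11001011, 2 * NBITS, NBITS) >> NBITS
# The same coefficient with "0"s stuffed instead of audio data.
without_audio = decode_partial(0b11000000, 2 * NBITS, NBITS) >> NBITS
# A background coefficient lives entirely in the lower planes.
background = decode_partial(0b00000101, 2 * NBITS, NBITS)
```

Both object results are identical, because the lower planes
never contribute to the object coefficient; only the
background, which lives in the lower planes, is lost when
decoding stops early.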
[0193] The aforementioned process until reproduction will be
explained below with reference to the flow chart shown in
Fig. 29.

[0194] Referring to Fig. 29, in step S501 JPEG2000 encoded
data is read out to decode the SHIFT code, and to check if
audio encoded data is appended. If it is determined that no
audio encoded data is appended, the flow advances to step
S507 to execute a normal decoding process of JPEG2000
encoded data, thus reclaiming image data.

[0195] On the other hand, if it is determined in step S501
that audio encoded data is appended, the flow advances to
step S502 to decode a header and mask information contained
in that encoded data. The flow advances to step S503 to
decode bit plane data, and to demultiplex them into texture
data and audio encoded data. The flow advances to step S504
to reconstruct and display the texture data on the display
2860. At the same time, the audio encoded data is decoded
and the audio data is reproduced by the sound device 2863 in
step S505. Finally, it is checked in step S506 if all frames
have been processed. If frame data to be decoded still
remain, the flow returns to step S501 to repeat the
aforementioned process; if all frame data have been decoded,
this process ends.

[0196] With the series of operations mentioned above,
both image and audio data can be reproduced while
maintaining compatibility with conventional JPEG2000
encoded data.

[0197] In the 12th embodiment, JPEG2000 encoded
data is input, but the present invention is not limited to
such specific data. In the above arrangement, some or
all functions may be implemented by software or the like.
[13th Embodiment]

[0198] Fig. 30 is a block diagram showing the ar- 
rangement of an image processing apparatus according 
to the 13th embodiment of the present invention. Note 
that the same reference numerals denote the same 
building components as in the 11th embodiment above, 
and a detailed description thereof will be omitted. The 
13th embodiment will exemplify a case wherein image
data sensed by a camera 3031 is input and encoded,
information which is helpful in, e.g., search is appended
to the encoded data, and that encoded data is output.
[0199] Referring to Fig. 30, reference numeral 3031 
denotes a camera for generating an image signal by 
capturing an image. Reference numeral 3032 denotes 
a frame memory for storing the captured image data in 
units of frames. Reference numeral 3033 denotes a terminal
at which the user inputs information helpful in
search. From this terminal 3033, the user can input meta
information such as information that pertains to the image
sensing date, place, photographer, image sensing
condition, and object upon sensing an image using the
camera 3031. Reference numeral 3034 denotes a memory
for storing information input from the terminal 3033.
Reference numeral 3035 denotes a region setting unit 
for displaying image data captured by the camera 3031 
and allowing the user to set a region of interest (ROI) 
using an input device such as a digitizer or the like. The 
ROI is an image region which is to be preferentially en- 
coded/decoded. Reference numeral 3036 denotes a re- 
gion memory for holding ROI information set by the re- 
gion setting unit 3035. Reference numeral 3037 denotes 
a bit plane composition unit for composing the contents 
of the memory 3034 with image data by stuffing the con- 
tents in the lower bits of the ROI in accordance with the 
number of bits which form bit planes, and the contents 
of the region memory 3036. 

[0200] The operation of the image processing appa- 
ratus with the above arrangement will be described be- 
low. 

[0201] Image data captured by the camera 3031 is 
temporarily stored in the frame memory 3032, and that 
image is displayed on the region setting unit 3035. When 
the user designates the region of interest (ROI) using 
the region setting unit 3035 with reference to the dis- 
played image, data indicating the ROI is stored in the 
region memory 3036. The discrete wavelet transformer
2408 computes the discrete wavelet transforms of the
contents of the frame memory 3032, and the quantizer
2409 quantizes the computed transform coefficients.
The bit shift unit 2410 bit-shifts the transform coefficients
contained inside the ROI to the MSB side in accordance
with the region information which is set and stored in the
region memory 3036.

[0202] At the same time, the user inputs from the terminal
3033 information that pertains to the date, place,
photographer, image sensing condition, and object upon
sensing the image using the camera 3031, and
stores that information in the memory 3034.
[0203] The bit plane composition unit 3037 writes the
meta information supplied from the memory 3034 in the
lower bits, which are left blank after the transform coefficients
contained in the ROI are shifted, bit by bit in the
order of scan lines, thus generating composite data of
the transform coefficients of image data and meta information,
as in the 11th embodiment. The entropy encoder
2413 encodes these data, and externally outputs the
encoded data via the code output unit 2415.
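
The composition performed by the bit shift unit and the bit plane composition unit can be illustrated with a minimal sketch. It is not the apparatus's actual implementation: the function name `compose_bit_planes`, the parameter `shift`, the use of non-negative integer magnitudes, and the plane-by-plane (upper to lower), scan-line write order are all assumptions made for illustration.

```python
def compose_bit_planes(coeffs, roi_mask, meta_bits, shift):
    """Shift ROI coefficients up by `shift` bits and stuff meta bits below them.

    coeffs    : 2-D list of non-negative quantized coefficient magnitudes (assumed)
    roi_mask  : 2-D list, 1 inside the ROI and 0 outside
    meta_bits : flat list of 0/1 values to embed
    shift     : number of blank lower bit planes created inside the ROI
    """
    height, width = len(coeffs), len(coeffs[0])
    # ROI coefficients move toward the MSB side; the others stay where they are.
    out = [[(coeffs[y][x] << shift) if roi_mask[y][x] else coeffs[y][x]
            for x in range(width)] for y in range(height)]

    bits = iter(meta_bits)
    # Fill the blank lower bits of ROI positions, one plane at a time, in scan-line order.
    for plane in range(shift - 1, -1, -1):
        for y in range(height):
            for x in range(width):
                if roi_mask[y][x]:
                    out[y][x] |= next(bits, 0) << plane  # pad with 0 when data runs out
    return out


composite = compose_bit_planes([[3, 1], [2, 0]], [[1, 0], [0, 1]], [1, 0, 1, 1], shift=2)
```

In this sketch the background coefficients are assumed to fit below `shift` bits, so their upper bit planes remain zero.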

[0204] An output example of the encoded data ob- 
tained by the aforementioned encoding process is the 
same as that shown in Fig. 8. 

[0205] Referring to Fig. 8, reference numeral 801 denotes
a header containing a code which indicates information
such as the size of the encoded image. Reference
numeral 802 denotes a BITS code as the encoding
result of the number of bit planes. Reference numeral
803 denotes data that stores the entropy encoding
result of each bit plane. The entropy encoding result is
separated into bit planes, each of which consists of encoded
data for respective subbands.
[0206] As described above, according to the 13th embodiment,
encoded data obtained by appending information
required for search to captured image data can
be generated while maintaining compatibility with
conventional JPEG2000 encoded data.
[0207] In the 13th embodiment, the quantizer 2409 is
provided to improve coding efficiency. However, the
quantizer 2409 may be omitted to obtain reversible
codes free from any deterioration.
[0208] In the 13th embodiment, meta information is
exemplified as data to be appended. However, the
present invention is not limited to such specific data. For
example, audio data may be appended as in the 11th
embodiment, or other kinds of information may be appended.
In the aforementioned arrangement, some or
all functions may be implemented by software or the like.

[14th Embodiment]

[0209] Fig. 31 is a block diagram showing the arrangement
of an image processing apparatus according
to the 14th embodiment of the present invention. Note
that the same reference numerals denote the same
building components as in the 12th embodiment (Fig.
28), and a detailed description thereof will be omitted.
The 14th embodiment will explain a case wherein
JPEG2000 encoded data generated by the 13th embodiment
is input, and an image is reproduced.

[0210] Referring to Fig. 31, reference numeral 3151
denotes an entropy decoder for decoding encoded data
of a header and bit planes. Reference numeral 3152 denotes
a frame memory for storing image data decoded
by the entropy decoder 3151. Reference numeral 3153
denotes an ROI extraction unit for extracting an ROI
from the contents of the frame memory 3152. Reference
numeral 3154 denotes a meta information extractor for
extracting meta information from the contents of the
frame memory 3152. Reference numeral 3155 denotes
a display for displaying image data and meta information.

[0211] In such an arrangement, the code input unit 2851
inputs encoded data generated by the 13th embodiment 
above. The input encoded data is supplied to the entro- 
py decoder 3151 to decode a header and BITS code, 
thus obtaining required information. Then, the encoded 
data is decoded into bit plane data, which are stored in 
the frame memory 3152. 

[0212] The ROI extraction unit 3153 reads out bit 
planes obtained by encoding the ROI on the basis of the 
number of bits obtained by decoding the BITS code, and 
determines the ROI by collecting pixels with nonzero 
values. Therefore, by replacing pixels with nonzero values
by "1", and pixels with values "0" by "0", binary information
indicating the ROI can be extracted. The extracted
ROI information is input to the bit shift unit 2856
and the meta information extractor 3154.
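
The ROI determination described above can be sketched as follows. This is only an illustration: the function name `extract_roi_mask` and the parameter `shift` (standing in for the number of bits obtained from the BITS code) are hypothetical, and the decoded values are assumed to be non-negative integers.

```python
def extract_roi_mask(values, shift):
    """Return a binary mask that is 1 where the upper bit planes hold a nonzero value.

    values : 2-D list of decoded non-negative integers (assumed)
    shift  : number of lower bit planes lying below the shifted ROI data (assumed)
    """
    # Discard the lower `shift` bits; whatever is left belongs to the ROI planes.
    return [[1 if (v >> shift) != 0 else 0 for v in row] for row in values]


mask = extract_roi_mask([[13, 1], [2, 22]], shift=2)
```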
[0213] The bit shift unit 2856 shifts the ROI data to- 
ward the LSB side as in the 12th embodiment, the de- 
quantizer 2857 dequantizes the shifted data, and the in- 
verse discrete wavelet transformer 2858 reconstructs 
image data. The reconstructed image data is stored in 
the frame memory 2859. 

[0214] On the other hand, the meta information ex- 
tractor 3154 reconstructs meta information by reading 
out the meta information in the lower bits of the ROI in 
the order of bit planes and scan lines. The reconstructed 
image data and meta information are input to the display 
3155, which displays the image and meta information. 
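
Reading the meta information back out of the lower bits can be sketched in the same spirit. The function name `extract_meta_bits`, the parameters `shift` and `n_bits`, and the exact read order (upper blank plane first, then scan lines) are assumptions for illustration; the text only states that the bits are read in the order of bit planes and scan lines.

```python
def extract_meta_bits(values, roi_mask, shift, n_bits):
    """Collect up to `n_bits` embedded bits from the lower bits of ROI positions.

    values   : 2-D list of decoded non-negative integers (assumed)
    roi_mask : 2-D list, 1 inside the ROI and 0 outside
    shift    : number of lower bit planes that carry embedded data (assumed)
    n_bits   : how many embedded bits to read (assumed to be known to the reader)
    """
    bits = []
    for plane in range(shift - 1, -1, -1):       # bit plane order
        for y, row in enumerate(values):         # scan-line order
            for x, v in enumerate(row):
                if roi_mask[y][x] and len(bits) < n_bits:
                    bits.append((v >> plane) & 1)
    return bits


meta = extract_meta_bits([[15, 1], [2, 21]], [[1, 0], [0, 1]], shift=2, n_bits=4)
```

The image coefficient itself would be recovered separately by shifting each ROI value toward the LSB side by `shift` bits.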
[0215] As a characteristic feature of the ROI, even
when this decoding process is aborted, an image can
be reproduced by decoding only the upper bits, irrespective
of the presence/absence of meta information. With the series of
operations mentioned above, both image data and meta
information can be reconstructed while maintaining
compatibility with conventional JPEG2000 encoded
data. In this way, many kinds of information can be provided
to the user, and search can be easily made using,
e.g., keywords.

[0216] In the 14th embodiment, JPEG2000 encoded 
data is input, but the present invention is not limited to 
such specific data. In the 14th embodiment, text infor- 
mation is input from the terminal, but the present inven- 
tion is not limited to such specific information. For ex- 
ample, meta information specified by MPEG-7 may be 
input. In the aforementioned arrangement, some or all 
functions may be implemented by software or the like. 

[15th Embodiment] 

[0217] An image processing apparatus according to
the 15th embodiment of the present invention will be explained
below. This image processing apparatus has the
same arrangement as that shown in Fig. 10.
[0218] The operation for converting still image data
stored in the storage unit 504 into JPEG2000 encoded
data by the CPU 500 will be explained below with reference
to the flow chart shown in Fig. 32.
[0219] In step S601, image data selected at the terminal
506 is read out from the storage unit 504, and is
stored in the image area of the memory 501. The flow
advances to step S602 to display an image based on
the image data on the monitor 505, and to have the user
set an ROI of that image from the terminal 506 using, e.g.,
a digitizer or the like. As the ROI, a shape data field
is reserved on the memory 501, and shape data which
assumes "1" for pixels inside the ROI and "0" for other
pixels is stored as binary shape information. The flow
advances to step S603 to have the user input security
information such as copyright information or the like
from the terminal 506. This security information is, e.g.,
a password, based on which an encryption key is generated.
Also, the copyright information is encrypted, and
that encrypted data is stored in a data area reserved on
the memory 501 for respective bits. Let bs (bits) be the
information volume at that time.

[0220] The flow advances to step S604 to reserve ROI
and BG fields on the image area on the memory 501,
and to store image data contained in the ROI in the ROI
field in accordance with the shape data field. Also, image
data outside the ROI is stored in the BG field. As a
result, a composite image of the texture data 200 in Fig.
19 and the blank fields 202 outside the ROI is obtained.
[0221] The flow advances to step S605 to scramble
the image data in the BG field in accordance with the
aforementioned password. The flow advances to step
S606 to encode the entire image data by JPEG2000.
The encoded data is saved or sent in step S607.
[0222] Figs. 33 and 34 are flow charts showing the
encoding process in step S606 in Fig. 32. Let n be the
bit depth of the BG field, and m be the bit depth of the
entire image data in the BG and ROI fields (see Fig. 19).
Also, let x_size and y_size be the sizes of the image in
the main scan and sub-scan directions.

[0223] In step S610, "m" is substituted in a variable z
for counting the bit depth, "0" in a variable x that indicates
the pixel position in the main scan direction, and
"0" in a variable y indicating the pixel position in the sub-scan
direction. The flow advances to step S611 to check
if the variable z falls within the range between "n" and
"m-1". If the variable z falls within this range, the flow
advances to step S612; otherwise, it is determined that
the ROI process ends, and the flow advances to step
S622 (Fig. 34).

[0224] It is checked in step S612 if y < y_size. If NO
in step S612, the flow advances to step S620. Since the
process for all the bits of the bit plane to be processed
is complete, z is decremented by "1", and the flow returns
to step S611.

[0225] If y < y_size in step S612, the flow advances
to step S613 to check if x < x_size. If YES in step S613,
the flow advances to step S615; otherwise, the flow advances
to step S614. In step S614, since the process
for all the bits of the bit plane to be processed in the
main scan direction is complete, y is incremented by "1",
and the flow returns to step S612.
[0226] On the other hand, if x < x_size in step S613,
corresponding pixel information in the shape data field
(Shape(x, y)) on the memory 501 is read out in step
S615. If that pixel data is "1", the flow advances to step
S616; otherwise, the flow advances to step S617. In
step S616, since the pixel to be processed falls within
the ROI, the corresponding bit of the corresponding pixel
in the ROI is substituted in a variable T. In step S617,
since the pixel to be processed falls outside the ROI, "0"
is substituted in the variable T.

[0227] Upon completion of step S616 or S617, the 
flow advances to step S618 to encode the pixel by 
JPEG2000. The flow advances to step S619, and x is 
incremented by "1". The flow then returns to step S613 
to compare x with x_size. 
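
The loop of steps S610 to S620 can be sketched as nested loops over bit planes, scan lines, and pixels. This is a hedged illustration only: the function name `roi_plane_bits` is hypothetical, the JPEG2000 encoding of step S618 is replaced by appending T to a list, the ROI field is modelled as one integer per pixel, and z is started at m-1 (the upper end of the range checked in step S611) rather than at m.

```python
def roi_plane_bits(roi, shape, n, m):
    """Emit the bit T for every pixel of the ROI bit planes, from plane m-1 down to n.

    roi   : 2-D list of integers; bit z of roi[y][x] is the ROI field bit (assumed layout)
    shape : 2-D list, 1 inside the ROI and 0 outside
    n, m  : bit depth of the BG field and of the entire image data
    """
    out = []
    for z in range(m - 1, n - 1, -1):              # S611 range check, S620 decrement
        for y in range(len(shape)):                # S612 check, S614 increment
            for x in range(len(shape[0])):         # S613 check, S619 increment
                if shape[y][x] == 1:               # S615: read Shape(x, y)
                    t = (roi[y][x] >> z) & 1       # S616: bit of the ROI pixel
                else:
                    t = 0                          # S617: outside the ROI
                out.append(t)                      # S618: stand-in for JPEG2000 coding
    return out


bits = roi_plane_bits([[0b1100, 0]], [[1, 0]], n=2, m=4)
```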

[0228] If the variable z falls outside the range from "n"
to "m-1" in step S611, the flow advances to step S622
to substitute "0" in x and y, and in a variable A for counting
the number of bits of the data field on the memory 501.
The flow advances to step S623 to check if z > "0". If
YES in step S623, the flow advances to step S624; otherwise,
it is determined that the encoding process for
the entire image is complete, and the operation ends.
[0229] If y < y_size in step S624, the flow advances
to step S626; otherwise, the flow advances to step
S625. In step S625, since the process for all the bits of the bit plane
to be processed is complete, z is decremented by "1",
and the flow returns to step S623. It is checked in step
S626 if x < x_size. If YES in step S626, the flow advances
to step S628; otherwise, the flow advances to step
S627. In step S627, since the process for all the bits of the bit plane
to be processed in the main scan direction is complete,
y is incremented by "1", and the flow returns to step
S624.

[0230] In step S628, the corresponding pixel information
(Shape(x, y)) of the shape data field is read out. If
that information is "1", the flow advances to step S629;
otherwise, the flow advances to step S630. In step
S630, since the pixel to be processed falls outside the
ROI, the corresponding bit (BG(x, y, z)) of the corresponding
pixel in the BG field is substituted in the variable
T. It is checked in step S629 if A < bs. If YES in step
S629, the flow advances to step S631; otherwise, the
flow advances to step S632. In step S631, since the bit
to be processed is encrypted data, the A-th bit of the
data field is substituted in the variable T, and the variable
A is incremented by +1. If A ≥ bs, the flow advances to
step S632, and since the encrypted data has been processed,
"0" is substituted in the variable T.

[0231] Upon completion of the process in step S630,
S631, or S632, the flow advances to step S633 to encode
the pixel by JPEG2000. The flow advances to step
S634 to increment the variable x by "1", and the flow
returns to step S626 to compare the variable x with
x_size.
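
The lower bit plane pass of steps S622 to S634 can be sketched in the same way. The function name `lower_plane_bits` is hypothetical, the JPEG2000 encoding of step S633 is replaced by appending T to a list, the BG field is modelled as one integer per pixel, and the planes are assumed to run from n-1 down to 0.

```python
def lower_plane_bits(bg, shape, data_bits, n):
    """Emit the bit T for every pixel of the lower bit planes.

    bg        : 2-D list of integers; bit z of bg[y][x] is BG(x, y, z) (assumed layout)
    shape     : 2-D list, 1 inside the ROI and 0 outside
    data_bits : encrypted data as a flat list of 0/1 values; its length plays the role of bs
    n         : bit depth of the BG field
    """
    bs = len(data_bits)
    a = 0                                          # S622: counter A starts at 0
    out = []
    for z in range(n - 1, -1, -1):                 # S623 check, S625 decrement
        for y in range(len(shape)):                # S624 check, S627 increment
            for x in range(len(shape[0])):         # S626 check, S634 increment
                if shape[y][x] == 1:               # S628: inside the ROI
                    if a < bs:                     # S629
                        t = data_bits[a]           # S631: A-th bit of the data field
                        a += 1
                    else:
                        t = 0                      # S632: data exhausted, pad with 0
                else:
                    t = (bg[y][x] >> z) & 1        # S630: bit of the BG pixel
                out.append(t)                      # S633: stand-in for JPEG2000 coding
    return out


lower = lower_plane_bits([[0, 0b10]], [[1, 0]], [1], n=2)
```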



[0232] If it is determined in step S623 that the process
for all the bits is complete, the encoding process ends.
The encoded data generated in this way is stored or
saved in the storage unit 504, and is output onto the
communication line 508 via the communication interface
507 in accordance with a user's instruction.
[0233] With the series of operations mentioned above,
copyright information can be efficiently appended to image
data while maintaining compatibility with
conventional JPEG2000 encoded data. In this way, many kinds
of information can be provided to the user, and copyright
protection and security management of information can
be easily implemented.

[0234] In the 15th embodiment, JPEG2000 encoded
data is output as an encoding result. However, the
present invention is not limited to such specific data. In
the aforementioned arrangement, some or all functions
may be implemented by hardware or the like.



[16th Embodiment]

[0235] As the 16th embodiment of the present invention,
the operation for decoding JPEG2000 encoded data,
which is generated by the 15th embodiment using
the arrangement of the image processing apparatus
shown in Fig. 10, and is stored in the storage unit 504,
will be described below with reference to the flow chart
shown in Fig. 35.

[0236] In step S701, JPEG2000 encoded data selected
at the terminal 506 is read out from the storage unit
504, and is stored in the code area of the memory 501.
The flow advances to step S702 to decode the encoded
data stored in the code area by JPEG2000.
[0237] The decoding process in step S702 will be de- 
scribed below with reference to the flow charts shown 
in Figs. 36 and 37. 

[0238] In step S801, "m", "0", and "0" are respectively
substituted in variables z, x, and y. The flow advances
to step S802 to clear the shape data field and ROI field
on the memory 501 to "0". The flow advances to step
S803 to check if the variable z falls within the range from
"n" to "m-1". If YES in step S803, the flow advances to
step S804; otherwise, it is determined that the process
of the ROI is complete, and the flow advances to step
S813 (Fig. 37).

[0239] It is checked in step S804 if y < y_size. If YES 
in step S804, the flow advances to step S805; otherwise, 
the flow advances to step S812. Since the process for 
all the bits of the bit plane to be processed is complete, 
the variable z is decremented by "1", and the flow re- 
turns to step S803. 

[0240] It is checked in step S805 if x < x_size. If YES 
in step S805, the flow advances to step S806; otherwise, 
the flow advances to step S811. Since the process for
all the bits of the bit plane to be processed in the main 
scan direction is complete, the variable y is incremented 
by "1", and the flow returns to step S804. In step S806, 
T as 1-bit data is decoded by JPEG2000. 
[0241] The flow advances to step S807, and if T = "1",
the flow advances to step S808 to write "1" in a bit of the
corresponding pixel in the shape data field. On the other
hand, if T ≠ "1", the flow advances to step S810. After
step S808, the flow advances to step S809 to write "1"
in bit information of the corresponding pixel in the ROI
field. The flow then advances to step S810 to increment
the variable x by "1", and the flow returns to step S805
to repeat the aforementioned process for comparing the
variable x with x_size.
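
The ROI pass of the decoder (steps S801 to S812) rebuilds both the shape data field and the ROI field from the decoded bits. The sketch below is an illustration under assumptions: the function name `decode_roi_planes` is hypothetical, the JPEG2000 decoding of step S806 is replaced by reading T from a list, the ROI field is modelled as one integer per pixel, and z starts at m-1.

```python
def decode_roi_planes(bits, width, height, n, m):
    """Rebuild the shape data field and the ROI field from upper bit plane data.

    bits          : flat list of decoded 0/1 values, plane by plane in scan-line order
    width, height : image size (x_size, y_size)
    n, m          : bit depth of the BG field and of the entire image data
    """
    shape = [[0] * width for _ in range(height)]   # S802: cleared to 0
    roi = [[0] * width for _ in range(height)]
    stream = iter(bits)
    for z in range(m - 1, n - 1, -1):              # S803 range check, S812 decrement
        for y in range(height):                    # S804 check, S811 increment
            for x in range(width):                 # S805 check, S810 increment
                t = next(stream)                   # S806: stand-in for JPEG2000 decoding
                if t == 1:                         # S807
                    shape[y][x] = 1                # S808: mark the pixel as ROI
                    roi[y][x] |= 1 << z            # S809: set the ROI field bit
    return shape, roi


shape_field, roi_field = decode_roi_planes([1, 0, 1, 0], width=2, height=1, n=2, m=4)
```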
[0242] If it is determined in step S803 that the variable
z falls outside the range from "n" to "m-1", the flow advances
to step S813 to substitute "0" in variables A, x,
and y. The flow advances to step S814 to check if z >
"0". If YES in step S814, the flow advances to step S815;
otherwise, it is determined that the decoding process for
the entire image is complete, and the process ends.
[0243] It is checked in step S815 if y < y_size. If YES
in step S815, the flow advances to step S816 to check
if x < x_size. If y ≥ y_size in step S815, the flow advances
to step S823 to decrement the variable z by "1" since
the process for all the bits of the bit plane to be processed
is complete. The flow then returns to step S814.
[0244] If x < x_size in step S816, the flow advances
to step S817 to execute the decoding process; otherwise,
the flow advances to step S822 to increment the
variable y by "1" since the process for all the bits of the
bit plane to be processed in the main scan direction is
complete. The flow then returns to step S815.
[0245] After T as 1-bit data is decoded by JPEG2000
in step S817, the flow advances to step S818 to read
out the corresponding pixel information of the shape
data field of the memory 501. If the value of that information
is "1", the flow advances to step S819. Since the
pixel to be processed falls within the ROI, T is substituted
in the A-th bit of the data field of the memory 501,
the variable A is incremented by +1, and "0" is substituted
in the corresponding bit of the corresponding pixel
in the BG field. If the corresponding pixel information of
the shape data field is not "1" in step S818, the flow advances
to step S820. Since the pixel to be processed
falls outside the ROI, T is substituted in the corresponding
bit of the corresponding pixel of the BG field. Upon
completion of the process in step S819 or S820, the flow
advances to step S821 to increment the variable x by
"1", and the flow returns to step S816 to compare the
variable x with x_size.

[0246] In this way, if it is determined in step S814 that
the process for all the bits is complete, the decoding
process ends.
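
The lower bit plane pass of the decoder (steps S813 to S823) separates the appended data from the background bits. The following sketch rests on assumptions: the function name `demux_lower_planes` is hypothetical, the JPEG2000 decoding of step S817 is replaced by reading T from a list, the BG field is modelled as one integer per pixel, and the planes are assumed to run from n-1 down to 0.

```python
def demux_lower_planes(bits, shape, n):
    """Split lower bit plane data into the data field and the BG field.

    bits  : flat list of decoded 0/1 values, plane by plane in scan-line order
    shape : 2-D list, 1 inside the ROI and 0 outside (from the ROI pass)
    n     : bit depth of the BG field
    """
    height, width = len(shape), len(shape[0])
    bg = [[0] * width for _ in range(height)]
    data = []                                      # the A-th entry is the A-th data bit
    stream = iter(bits)
    for z in range(n - 1, -1, -1):                 # S814 check, S823 decrement
        for y in range(height):                    # S815 check, S822 increment
            for x in range(width):                 # S816 check, S821 increment
                t = next(stream)                   # S817: stand-in for JPEG2000 decoding
                if shape[y][x] == 1:               # S818
                    data.append(t)                 # S819: BG bit stays 0 here
                else:
                    bg[y][x] |= t << z             # S820: bit of the BG pixel
    return data, bg


data_field, bg_field = demux_lower_planes([1, 1, 0, 0], [[1, 0]], n=2)
```

Note that in this sketch the data field also collects any zero padding written after the appended data ran out; a reader of the data would need to know its length.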
[0247] Referring back to Fig. 35, security information
(password in this example) is input in step S703. The
flow then advances to step S704 to authenticate the decoded
data. If the authentication result is GOOD, the
flow advances to step S705. In step S705, the image in
the BG field is descrambled, and the descrambled image
data is stored in the BG field. On the other hand, if
the authentication result is NG in step S704, the flow
jumps to step S706 to display the scrambled image in
the BG field.

[0248] In this manner, the decoded image data in the 
ROI and BG field can be displayed on the monitor 505, 
stored or saved in the storage unit 504, or output onto 
the communication line 508 via the communication in- 
terface 507 in accordance with the information in the 
shape data field. 

[0249] With the series of operations mentioned above,
copyright information can be efficiently appended to image
data while maintaining compatibility with
conventional JPEG2000 encoded data. Since security information
is appended, image data can be easily reconstructed
in correspondence with the required security level.
[0250] In the 16th embodiment, JPEG2000 encoded 
data is input, but the present invention is not limited to 
such specific data. In the aforementioned arrangement, 
some or all functions may be implemented by hardware 
or the like. 

[0251] Note that the present invention may be applied
to either a system constituted by a plurality of devices
(e.g., a host computer, interface device, reader, video
camera, video cassette recorder, printer, and the like),
or an apparatus consisting of a single piece of equipment (e.g.,
a copying machine, facsimile apparatus, video camera,
video cassette recorder, or the like).
[0252] The objects of the present invention are also 
achieved by supplying a storage medium (or recording 
medium), which records a program code of a software 
program that can implement the functions of the above- 
mentioned embodiments to the system or apparatus, 
and reading out and executing the program code stored 
in the storage medium by a computer (or a CPU or MPU) 
of the system or apparatus. In this case, the program 
code itself read out from the storage medium imple- 
ments the functions of the above-mentioned embodi- 
ments, and the storage medium which stores the pro- 
gram code constitutes the present invention. The func- 
tions of the above-mentioned embodiments may be im- 
plemented not only by executing the readout program 
code by the computer but also by some or all of actual 
processing operations executed by an operating system 
(OS) running on the computer on the basis of an instruc- 
tion of the program code. 

[0253] Furthermore, the functions of the above-men- 
tioned embodiments may be implemented by some or 
all of actual processing operations executed by a CPU
or the like arranged in a function extension card or a
function extension unit, which is inserted in or connected
to the computer, after the program code read out from 
the storage medium is written in a memory of the exten- 
sion card or unit. 

[0254] For the sake of simplicity in the description of 
the present invention, each embodiment has explained 
a case wherein one object is contained. However, a plu- 
rality of objects can be processed by executing the same 
process for each object. 

[0255] In the descriptions of the above embodiments, 
the respective embodiments have been independently
explained. However, the present invention is not limited 
to such specific embodiments, and these embodiments 
may be implemented solely or in combination as need- 
ed. 

[0256] To restate, according to the above embodi- 
ments, since data of an occluded portion of the back- 
ground image is inserted in lower bit planes of encoded 
data having an ROI function like JPEG2000, data can 
be encoded while maintaining both the object and
background.

[0257] Also, re-conversion to object encoded data 
such as MPEG-4 can be easily done. 
[0258] The present invention is not limited to the 
above embodiments and various changes and modifi- 
cations can be made within the spirit and scope of the 
present invention. Therefore, to apprise the public of the 
scope of the present invention, the following claims are 
made. 



Claims 

1. An image processing apparatus characterised by
comprising: 

image input means for inputting image data; 
information input means for inputting informa- 
tion data; 

region of interest setting means for setting a re- 
gion of interest on the basis of the image data; 
transformation means for generating transform 
coefficients by computing frequency trans- 
forms of the image data; and 
control means for bit-shifting transform coeffi- 
cients, which correspond to the region of inter- 
est, of the transform coefficients generated by 
said transformation means to upper bit planes, 
stuffing zeros in blank fields outside the region 
of interest, which are generated by the bit shift 
process, and stuffing the information data in 
blank fields within the region of interest, which 
are generated by the bit shift process. 

2. The apparatus according to claim 1, characterised
by further comprising quantization means for quan- 
tizing the transform coefficients generated by said 
transformation means. 

3. The apparatus according to claim 1 or 2, charac- 
terised in that said transformation means executes 
discrete wavelet transformation. 

4. The apparatus according to any one of claims 1-3, 
characterised in that the information data is audio 
information. 

5. The apparatus according to any one of claims 1-3,
characterised in that the information data is meta
data which pertains to a description of the image
data.

6. The apparatus according to any one of claims 1-3,
characterised in that the information data includes
intellectual property right information.

7. The apparatus according to claim 1, characterised
by further comprising encoding means for decomposing
the transform coefficients stuffed by said
control means into bit planes, and encoding the bit
planes.

8. An image processing apparatus characterised by
comprising:

encoded data input means for inputting encoded
data;
decoding means for decoding the encoded data
input by said encoded data input means;
region of interest extraction means for extracting
a region of interest from a decoding result
decoded by said decoding means; and
information data extraction means for extracting
information data from lower bit planes of the
region of interest extracted by said region of interest
extraction means.

9. An image processing method characterised by
comprising:

an image input step of inputting image data;
an information input step of inputting information
data;
a region of interest setting step of setting a region
of interest on the basis of the image data;
a transformation step of generating transform
coefficients by computing frequency transforms
of the image data; and
a control step of bit-shifting transform coefficients,
which correspond to the region of interest,
of the transform coefficients to upper bit
planes, stuffing zeros in blank fields outside the
region of interest, which are generated by the
bit shift process, and stuffing the information
data in blank fields within the region of interest,
which are generated by the bit shift process.

10. The method according to claim 9, characterised by
further comprising a quantization step of quantizing
the transform coefficients generated in said transformation
step.

11. The method according to claim 9 or 10, characterised
in that in said transformation step, a discrete
wavelet transformation is executed.






EP 1 162 573 A2 






12. The method according to any one of claims 9-11,
characterised in that the information data is audio 
information. 

13. The method according to any one of claims 9-11, 
characterised in that the information data is meta 
data which pertains to a description of the image 
data. 

14. The method according to any one of claims 9-11,
characterised in that the information data is
intellectual property right information.

15. The method according to claim 9, characterised by
further comprising an encoding step of decompos- 
ing the transform coefficients output in said control 
step into bit planes, and encoding the bit planes. 

16. An image processing method characterised by
comprising the steps of:

inputting encoded data;
decoding the encoded data;
extracting a region of interest from a decoding
result decoded in said decoding step; and
extracting information data from lower bit
planes of the extracted region of interest.

17. A computer readable storage medium characterised
by storing a program for implementing an image
processing method according to claim 9 or 16.

18. An image processing apparatus characterised by
comprising:

generation means for generating object image 
data which represents an object image, and 
background image data to be composed in a 
background of the object image; 
transformation means for generating first trans- 
form coefficients by computing frequency 
transforms of the object image data and the 
background image data corresponding to a re- 
gion outside a region of the object image, and 
generating second transform coefficients by 
computing frequency transforms of the back- 
ground image data corresponding to at least 
the region of the object image; and 
control means for bit-shifting bits, which corre- 
spond to the region of the object image, of the 
first transform coefficients to upper bit planes, 
stuffing zeros in blank fields outside the region, 
which are generated by the bit shift process, 
and stuffing the second transform coefficients 
corresponding to the interior of the region in 
blank fields within the region, which are gener- 
ated by the bit shift process. 



19. An image processing apparatus characterised by 
comprising: 

generation means for generating object image
data which represents an object image, and
background image data to be composed in a
background of the object image;
first transformation means for computing frequency
transforms of the object image data and
the background image data corresponding to a
region outside a region of the object image;
second transformation means for computing
frequency transforms of the background image
data corresponding to at least the region of the
object image; and

control means for bit-shifting bits, which corre- 
spond to the region of the object image, of the 
transform coefficients obtained by said first 
transformation means to upper bit planes, stuff- 
ing zeros in blank fields outside the region, 
which are generated by the bit shift process, 
and stuffing the transform coefficients, which 
are obtained by said second transformation 
means and correspond to the interior of the re- 
gion, in blank fields within the region, which are 
generated by the bit shift process. 

20. An image processing apparatus characterised by 

comprising: 

shape information extraction means for extract- 
ing shape information of an object from image 
data; 

object texture information extraction means for 
extracting texture information of the object from 
the image data; 

background texture information extraction 
means for extracting texture information of a 
background from the image data; 
first frequency transformation means for com- 
puting frequency transforms of the texture in- 
formation of the object and the texture informa- 
tion of the background on the basis of the shape 
information extracted by said shape informa- 
tion extraction means; 

second frequency transformation means for 
computing frequency transforms of the texture 
information of the background; 
stuffing means for stuffing zeros in a region out- 
side a region of the object on the basis of an 
output from said first frequency transformation 
means, and the shape information; and 
bit plane encoding means for decomposing an
output from said stuffing means into bit planes
and encoding the bit planes, and decomposing
an output from said second frequency transformation
means into bit planes and encoding the
bit planes.

21. The apparatus according to claim 20, characterised
by further comprising quantization means for
quantizing transform coefficients computed by said
first and second frequency transformation means.

22. The apparatus according to claim 20, characterised
by further comprising image input means for
inputting the image data on the basis of a captured
image.

23. The apparatus according to any one of claims 20, 
characterised in that the image data is encoded 
data encoded by MPEG-4. 

24. The apparatus according to claim 20, character- 
ised in that at least one of said first and second 
frequency transformation means executes discrete 
wavelet transformation. 
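
For orientation, one level of a discrete wavelet transformation can be sketched as below. The claims do not fix a particular filter, so this example assumes a one-dimensional integer Haar (S-transform) pair purely for illustration; `haar_forward` and `haar_inverse` are hypothetical names.

```python
def haar_forward(x):
    """One level of an integer Haar transform on an even-length list."""
    if len(x) % 2 != 0:
        raise ValueError("input length must be even")
    pairs = list(zip(x[0::2], x[1::2]))
    low = [(a + b) // 2 for a, b in pairs]   # low-pass: floored average
    high = [a - b for a, b in pairs]         # high-pass: difference
    return low, high


def haar_inverse(low, high):
    """Exactly invert haar_forward."""
    out = []
    for s, d in zip(low, high):
        a = s + (d + 1) // 2   # s = a - ceil(d / 2), so add ceil(d / 2) back
        out.extend([a, a - d])
    return out


low, high = haar_forward([5, 2, 2, 5, 8, 8])
restored = haar_inverse(low, high)
```

Because the pair is reversible on integers, `restored` equals the input, which is what allows the lower bit planes to carry other data without disturbing reconstruction of the shifted coefficients.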

25. The apparatus according to claim 20, character- 
ised by further comprising shape correction means 
for expanding the shape information on the basis of 
the shape information extracted by said shape in- 
formation extraction means, and a frequency trans- 
formation scheme of said first frequency transfor- 
mation means. 
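
One way to picture the shape correction of claim 25 is as a dilation of the shape mask by a margin that depends on the transform. The sketch below is an assumption-laden illustration, not the claimed means: it works on a one-dimensional boolean mask, and `radius` is a hypothetical parameter standing in for whatever margin the chosen frequency transformation scheme requires.

```python
def expand_shape(mask, radius):
    """Dilate a 1-D boolean shape mask by `radius` samples on each side."""
    n = len(mask)
    return [any(mask[max(0, i - radius):min(n, i + radius + 1)])
            for i in range(n)]


expanded = expand_shape([False, False, True, False, False, False], 1)
```

A single object sample at index 2 grows to cover indices 1 to 3, so coefficients influenced by the object near its boundary are still treated as belonging to it.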

26. An image processing apparatus characterised by 

comprising: 

input means for inputting encoded data; 
first bit plane decoding means for decoding a 
first group of bit planes; 
shape information extraction means for extract- 
ing shape information of an object from a de- 
coding result of said first bit plane decoding 
means; 

first inverse frequency transformation means 
for computing inverse frequency transforms of 
the decoding result of said first bit plane decod- 
ing means; 

object texture information extraction means for 
extracting texture information of the object from 
a transformation result of said first inverse fre- 
quency transformation means; 
second bit plane decoding means for decoding 
a second group of bit planes; 
second inverse frequency transformation 
means for computing inverse frequency trans- 
forms of a decoding result of said second bit 
plane decoding means; 

background texture information extraction 
means for extracting texture information of a 
background from a transformation result of said 
second inverse frequency transformation 
means; 

object shape information encoding means for
generating object shape information encoded
data by encoding the shape information of the
object;
object encoding means for generating texture
encoded data of the object by encoding an output
from said first inverse frequency transformation
means;
background encoding means for generating
texture encoded data of the background by encoding
a transformation result of said second
inverse frequency transformation means; and
output means for outputting, as object encoded
data, the object shape encoded data, the texture
encoded data of the object, and the texture
encoded data of the background.

27. An image processing apparatus characterised by 
comprising: 

input means for inputting encoded data;
first bit plane decoding means for decoding a
first group of bit planes;
shape information extraction means for extracting
shape information of an object from a decoding
result of said first bit plane decoding
means;
first inverse frequency transformation means
for computing inverse frequency transforms of
the decoding result of said first bit plane decoding
means;
second bit plane decoding means for decoding
a second group of bit planes;
object texture information extraction means for
extracting texture information of the object from
a transformation result of said first inverse frequency
transformation means;
second inverse frequency transformation
means for computing inverse frequency transforms
of a decoding result of said second bit
plane decoding means; and
background texture extraction means for extracting
texture information of a background
from a transformation result of said second inverse
frequency transformation means.

28. The apparatus according to claim 26 or 27, characterised
by further comprising dequantization
means for dequantizing the decoding results of said
first and second bit plane decoding means.

29. The apparatus according to claim 26 or 27, characterised
in that said first and second inverse frequency
transformation means execute inverse discrete
wavelet transformation.

30. The apparatus according to claim 26 or 27, characterised
by further comprising shape information
correction means for reducing the shape information
on the basis of the shape information extracted
by said shape information extraction means, and a
transformation scheme of said first and second inverse
frequency transformation means.



33. An image processing method comprising: 

a generation step of generating object image 
data which represents an object image, and 
background image data to be composed in a 
background of the object image; 
a first transformation step of computing fre- 
quency transforms of the object image data and 
the background image data corresponding to a 
region outside a region of the object image; 
a second transformation step of computing fre- 
quency transforms of the background image 
data corresponding to at least the region of the 
object image; and 

a control step of bit-shifting bits, which corre- 
spond to the region of the object image, of the 
transform coefficients obtained in said first 
transformation step to upper bit planes, stuffing 
zeros in blank fields outside the region, which 
are generated by the bit shift process, and stuff- 
ing the transform coefficients, which are ob- 
tained in said second transformation step and 
correspond to the interior of the region, in blank 
fields within the region, which are generated by
the bit shift process.

34. An image processing method characterised by 

comprising: 

a shape information extraction step of extract- 
ing shape information of an object from image 
data; 

an object texture information extraction step of 
extracting texture information of the object from 
the image data; 

a background texture information extraction 
step of extracting texture information of a back- 
ground from the image data; 
a first frequency transformation step of comput- 
ing frequency transforms of the texture informa- 
tion of the object and the texture information of 
the background on the basis of the shape infor- 
mation extracted in said shape information ex- 
traction step; 

a second frequency transformation step of 
computing frequency transforms of the texture 
information of the background; 
a stuffing step of stuffing zeros in a region out- 
side a region of the object on the basis of an 
output by said first frequency transformation 
step, and the shape information; and 
a bit plane encoding step of decomposing an 
output of the stuffing step into bit planes and 
encoding the bit planes, and decomposing an 
output of the second frequency transformation 
step into bit planes and encoding the bit planes. 
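
The stuffing step and the bit plane decomposition of claim 34 can be sketched as follows. This is a minimal illustration assuming non-negative integer coefficients and a boolean shape mask; the entropy coding of each plane is omitted, and the function names are hypothetical.

```python
def stuff_zeros_outside(coeffs, shape_mask):
    """Keep coefficients inside the object region and zero the rest."""
    return [c if inside else 0 for c, inside in zip(coeffs, shape_mask)]


def to_bit_planes(coeffs, num_planes):
    """Decompose coefficients into bit planes, most significant first."""
    return [[(c >> p) & 1 for c in coeffs]
            for p in range(num_planes - 1, -1, -1)]


stuffed = stuff_zeros_outside([5, 3, 6, 1], [True, False, True, False])
planes = to_bit_planes(stuffed, 3)
```

After stuffing, every plane is zero outside the object region, so a bit plane coder spends its bits on the object coefficients only.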

35. The method according to claim 34, characterised
by further comprising a quantization step of quantizing
transform coefficients computed in said first
and second frequency transformation steps.

36. The method according to claim 34, characterised
by further comprising an image input step of inputting
the image data on the basis of a captured image.

37. The method according to claim 34, characterised
in that the image data is encoded data encoded by
MPEG-4.

38. The method according to claim 34, characterised
in that, in at least one of said first and second frequency
transformation steps, a discrete wavelet transformation
is executed.

39. The method according to claim 34, characterised
by further comprising a shape correction step of expanding
the shape information on the basis of the
shape information extracted in said shape information
extraction step, and a frequency transformation
scheme in said first frequency transformation step.



31. The apparatus according to any one of claims
26-30, wherein the object encoded data is encoded
data encoded by MPEG-4.

32. An image processing method characterised by
comprising:

a step of generating object image data which
represents an object image, and background
image data to be composed in a background of
the object image;
a step of generating first transform coefficients
by computing frequency transforms of the object
image data and the background image data
corresponding to a region outside a region of
the object image, and generating second transform
coefficients by computing frequency
transforms of the background image data corresponding
to at least the region of the object
image; and
a control step of bit-shifting bits, which correspond
to the region of the object image, of the
first transform coefficients to upper bit planes,
stuffing zeros in blank fields outside the region,
which are generated by the bit shift process,
and stuffing the second transform coefficients
corresponding to the interior of the region in
blank fields within the region, which are generated
by the bit shift process.

40. An image processing method comprising:

an input step of inputting encoded data; 

a first bit plane decoding step of decoding a first
group of bit planes;

a shape information extraction step of extract- 
ing shape information of an object from a de- 
coding result of the first bit plane decoding step; 
a first inverse frequency transformation step of 
computing inverse frequency transforms of the 
decoding result of the first bit plane decoding 
step; 

an object texture information extraction step of 
extracting texture information of the object from 
a transformation result of the first inverse fre- 
quency transformation step; 
a second bit plane decoding step of decoding 
a second group of bit planes; 
a second inverse frequency transformation 
step of computing inverse frequency trans- 
forms of a decoding result in said second bit 
plane decoding step; 

a background texture information extraction 
step of extracting texture information of a back- 
ground from a transformation result in said sec- 
ond inverse frequency transformation step; 
an object shape information encoding step of 
generating object shape information encoded 
data by encoding the shape information of the 
object; 

an object encoding step of generating texture 
encoded data of the object by encoding an out- 
put from said first inverse frequency transfor- 
mation step; 

a background encoding step of generating tex- 
ture encoded data of the background by encod- 
ing a transformation result in said second in- 
verse frequency transformation step; and 
an output step of outputting, as object encoded 
data, the object shape encoded data, the tex- 
ture encoded data of the object, and the texture 
encoded data of the background. 

41. An image processing method characterised by 

comprising: 

a step of inputting encoded data;
a first bit plane decoding step of decoding a first
group of bit planes;
a shape information extraction step of extracting
shape information of an object from a decoding
result in said first bit plane decoding
step;
a first inverse frequency transformation step of
computing inverse frequency transforms of the
decoding result in said first bit plane decoding
step;
an object texture information extraction step of
extracting texture information of the object from
a transformation result in said first inverse frequency
transformation step;
a second bit plane decoding step of decoding
a second group of bit planes;
a second inverse frequency transformation
step of computing inverse frequency transforms
of a decoding result in said second bit
plane decoding step; and
a background texture extraction step of extracting
texture information of a background from a
transformation result in said second inverse frequency
transformation step.

42. The method according to claim 40 or 41, characterised
by further comprising a dequantization step
of dequantizing the decoding results in said first and
second bit plane decoding steps.

43. The method according to claim 40 or 41, characterised
in that in said first and second inverse frequency
transformation steps, an inverse discrete
wavelet transformation is executed.

44. The method according to claim 40 or 41, characterised
by further comprising a shape information
correction step of reducing the shape information
on the basis of the shape information extracted in
said shape information extraction step, and a transformation
scheme at said first and second inverse
frequency transformation steps.

45. The method according to any one of claims 40-44,
characterised in that the object encoded data is
encoded data encoded by MPEG-4.

46. A computer readable storage medium character- 
ised by storing a program for implementing an im- 
age processing method according to any one of 
claims 32-45. 

47. A computer program characterised by comprising: 

an image input program code for inputting im- 
age data; 

an information input program code for inputting 
information data; 

a region of interest setting program code for
setting a region of interest on the basis of the
image data;
a transformation program code for generating
transform coefficients by computing frequency
transforms of the image data; and
a control program code for bit-shifting transform
coefficients, which correspond to the region
of interest, of the transform coefficients to
upper bit planes, stuffing zeros in blank fields
outside the region of interest, which are generated
by the bit shift process, and stuffing the
information data in blank fields within the region
of interest, which are generated by the bit shift
process.

48. A computer program comprising: 

an encoded data input program code for input- 
ting encoded data; 

a decoding program code for decoding the en- 
coded data; 

a region of interest extraction program code for 
extracting a region of interest from the decoding 
result; and 

an information data extraction program code for 
extracting information data from lower bit 
planes of the extracted region of interest. 

49. A computer program comprising: 

a generation program code for generating ob- 
ject image data which represents an object im- 
age, and background image data to be com- 
posited in a background of the object image; 
a transformation program code for generating 
first transform coefficients by computing fre- 
quency transforms of the object image data and 
the background image data corresponding to a 
region outside a region of the object image, and 
generating second transform coefficients by 
computing frequency transforms of the back- 
ground image data corresponding to at least 
the region of the object image; and 
a control program code for bit-shifting bits, 
which correspond to the region of the object im- 
age, of the first transform coefficients to upper 
bit planes, stuffing zeros in blank fields outside 
the region, which are generated by the bit shift 
process, and stuffing the second transform coefficients
corresponding to the interior of the region
in blank fields within the region, which are
generated by the bit shift process.

50. A computer program comprising: 

a generation program code for generating ob- 
ject image data which represents an object im- 
age, and background image data to be com- 
posited in a background of the object image; 
a first transformation program code for comput- 
ing frequency transforms of the object image 
data and the background image data corre- 
sponding to a region outside a region of the ob- 
ject image; 

a second transformation program code for 
computing frequency transforms of the back- 
ground image data corresponding to at least 
the region of the object image; and 



a control program code for bit-shifting bits,
which correspond to the region of the object image,
of the obtained transform coefficients to
upper bit planes, stuffing zeros in blank fields
outside the region, which are generated by the
bit shift process, and stuffing the transform coefficients,
which are obtained by executing the
second transformation program code and correspond
to the interior of the region, in blank
fields within the region, which are generated by
the bit shift process.

51. A computer program comprising:

a shape information extraction program code
for extracting shape information of an object
from image data;
an object texture information extraction program
code for extracting texture information of
the object from the image data;
a background texture information extraction
program code for extracting texture information
of a background from the image data;
a first frequency transformation program code
for computing frequency transforms of the texture
information of the object and the texture
information of the background on the basis of
the shape information extracted by the shape
information extraction program code;
a second frequency transformation program
code for computing frequency transforms of the
texture information of the background;
a stuffing program code for stuffing zeros in a
region outside a region of the object on the basis
of an output of the first frequency transformation
program code, and the shape information;
and
a bit plane encoding program code for decomposing
an output of the stuffing program code
into bit planes and encoding the bit planes, and
decomposing an output of the second frequency
transformation program code into bit planes
and encoding the bit planes.

52. A computer program comprising:

an input program code for inputting encoded
data;
a first bit plane decoding program code for decoding
first bit planes;
a shape information extraction program code
for extracting shape information of an object
from a decoding result of the first bit planes;
a first inverse frequency transformation program
code for computing inverse frequency
transforms of the decoding result of the first bit
plane decoding program code;
an object texture information extraction program
code for extracting texture information of
the object from a transformation result of the
first inverse frequency transformation program
code;
a second bit plane decoding program code for
decoding second bit planes;
a second inverse frequency transformation program
code for computing inverse frequency
transforms of a decoding result of the second
bit plane decoding program code;
a background texture information extraction
program code for extracting texture information
of a background from a transformation result of
the second inverse frequency transformation
program code;
an object shape information encoding program
code for generating object shape information
encoded data by encoding the shape information
of the object;
an object encoding program code for generating
texture encoded data of the object by encoding
an output of the first inverse frequency
transformation program code;
a background encoding program code for generating
texture encoded data of the background
by encoding an output of the second inverse
frequency transformation program code; and
an output program code for outputting, as object
encoded data, the object shape encoded
data, the texture encoded data of the object,
and the texture encoded data of the background.

53. A computer program comprising:

a program code for inputting encoded data;
a first bit plane decoding program code for decoding
first bit planes;
a shape information extraction program code
for extracting shape information of an object
from a decoding result of the first bit plane decoding
program code;
a first inverse frequency transformation program
code for computing inverse frequency
transforms of the decoding result of the first bit
plane decoding program code;
an object texture information extraction program
code for extracting texture information of
the object from a transformation result of the
first inverse frequency transformation program
code;
a second inverse frequency transformation program
code for computing inverse frequency
transforms of a decoding result of a second bit
plane decoding program code; and
a background texture extraction program code
for extracting texture information of a background
from a transformation result of the second
inverse frequency transformation program
code.